Journal of Applied Psychology 


John G. Darley, Editor
University of Minnesota 





Table of Contents 


The Effects of Top and Middle Management Sets on the Ghiselli Self-Description Inventory: 
R. A. Kaufman, K. L. Hakmiller, and L. W. Porter 


Effects of Massed Practice and Thickness of Handcoverings on Manipulation with Gloves: 
Hilde Groth and J. Lyman 


AVA Validity for Textile Workers: P. F. Merenda and W. V. Clarke 


An Investigation of Some Aspects of the Social Psychological Impact of an Educational Tele- 
vision Program: J. J. Asher and R. I. Evans 


The Psychologist as an Instrument of Prediction: A. Trankell 


A Coding System for Total Profile Analysis of the Strong Vocational Interest Blank: J. O. 


Art Work Versus Photography: An Experimental Study: C. Winick 


Self-Perceptions of First-Level Supervisors Compared with Upper-Management Personnel and 
with Operative Line Workers: L. W. Porter 


Subliminal Perception: Some Negative Findings: A. D. Calvin and Karen S. Dollenmayer

Evaluation of Training in Creative Problem Solving: A. Meadow and S. J. Parnes 


Increasing Probability of Target Detection with a Mirror-Image Display: C. H. Baker and 
G. E. Boyes 


A Factorial Study of Dexterity Tests: G. L. Bourassa and R. M. Guion 


Interactions Between Display Gain and Task-Induced Stress in Manual Tracking: W. D. 
Garvey and Jean B. Henson


Effects of Feedback on Insight and Problem Solving Efficiency in Training Groups: E. E.
Smith and S. S. Kight


Personality Test Scores in the Management Hierarchy: Revisited: H. D. Meyer and A. J. 
Fredian 





American Psychological Association 


Volume 43, Number 3 June, 1959 





Consulting Editors 


Harold E. Burtt, Ohio State University

Alphonse Chapanis, Johns Hopkins University

Clifford E. Jurgensen, Minneapolis Gas Company

Laurence S. McGaughran, University of Houston

Quinn McNemar, Stanford University

Alexander Mintz, City College of New York

Harold F. Rothe, Fairbanks, Morse and Company

Julian B. Rotter, Ohio State University

Thomas A. Ryan, Cornell University

Donald E. Super, Columbia University

Miles A. Tinker, University of Minnesota

Alfred C. Welch, University of New Mexico





This journal gives primary consideration to origi- 
nal investigations in any field of applied psychol- 
ogy except clinical and consulting psychology, al- 
though a descriptive or theoretical article may be 
accepted if it represents a special contribution in 
an applied field. Quantitative investigations of in- 
terest or value to psychologists working in the fol- 
lowing broad fields will be considered: vocational 
and educational prognosis, diagnosis, and guidance 
at the secondary and college level; personnel re- 
search in business, industry, and government; bio- 
mechanics; industrial working conditions; research 
on opinion and morale factors; job analysis and 
classification research; market and advertising research.


Because of the large number of manuscripts submitted,
authors should adhere to the rule of
“brevity consistent with clarity.” The typical
manuscript should run to approximately 4,000 
words. There is a lag of approximately twelve 
months between receipt and publication of an 
article. Authors may request advanced publica- 
tion if they are prepared to pay the cost of print- 
ing the necessary extra pages. 


Manuscripts should be addressed to the Editor, 
John G. Darley, 408 Johnston Hall, University of 
Minnesota, Minneapolis 14, Minnesota. All manu- 
scripts should be submitted in duplicate. Original 
figures are prepared for publication; duplicate fig- 
ures may be photographic or pencil-drawn copies. 


Manuscripts must conform to the style require- 
ments described in the Publication Manual of the 
American Psychological Association. 





Journal of Applied Psychology 


Published bimonthly by the 
American Psychological Association 
Prince and Lemon Sts., Lancaster, Pa. 
and 1333 Sixteenth Street N.W. 
Washington 6, D. C. 


$8.00 per volume


$1.50 per issue 


Arthur C. Hoffman, Managing Editor; Helen Orr, Promotion Manager; Herbert Newt, Editorial Assistant


Subscriptions, orders, and business communications should be addressed to the American Psychological Association,
1333 Sixteenth St. N.W., Washington 6, D. C. Address changes must reach the subscription office by the 10th of
the month to take effect the following month. Undelivered copies resulting from address changes will not be replaced;
subscribers should notify the post office that they will guarantee second-class forwarding postage. Other claims for
undelivered copies must be made within four months of publication.


Second class postage paid at Lancaster, Pennsylvania and at additional mailing places. 
© 1959 by the American Psychological Association, Inc. 





Journal of Applied Psychology 








Vol. 43, No. 3        June, 1959








THE EFFECTS OF TOP AND MIDDLE MANAGEMENT 
SETS ON THE GHISELLI SELF-DESCRIPTION 
INVENTORY 


ROGER A. KAUFMAN, KARL L. HAKMILLER,1 AND LYMAN W. PORTER


University of California 


A recent article by Porter and Ghiselli 
(1957) explored the differences in self-percep- 
tions between top management and middle 
management personnel employed in a wide 
variety of industries. The instrument used 
to gather the data was a self-description inventory
(SDI) developed by Ghiselli (1954);
21 of the 64 forced-choice items on the inven- 
tory differentiated between the two levels of 
management. Top management individuals, 
when compared to those of middle manage- 
ment, pictured themselves as more active, self- 
reliant, and enterprising. The 21 items that 
differentiated between the two groups were 
later given scale values and used by Ghiselli 
and Lodahl (1958) in a small-group experi- 
ment. The scale derived from these items 
was labeled a Decision-Making Approach 
(DMA) scale, since the items seemed to de- 
scribe primarily differences in the ways the 
two management groups attack problems. 

The raw data for the DMA scale of the 
SDI were based on the self-perception descrip- 
tions made by top and middle management 
individuals in situations in which they had no 
knowledge of the ultimate use of their answers 
to the items on the inventory; in other words, 
none of these management individuals knew 
that they were contributing data to a study 
of differences between top and middle man- 
agement personnel. Therefore, although some 
individuals were undoubtedly trying to make 
themselves look as favorable as possible on 
the scale, or were operating under some specific
self-induced “set,” there was no general
“top management” set operating for all of
the highest level personnel, or “middle
management” set for individuals in the lower
level positions.

1 Now at the University of Minnesota.

The over-all purpose of the present study 
was to determine the effect of specific “top” 
and “middle” management sets on the DMA 
scale of the SDI and on the other scales 
developed by Ghiselli in connection with the 
SDI. Ideally, the Ss for such a study should 
be people holding middle and top manage- 
ment positions in business and industry. Be- 
cause of the difficulties in gathering data from 
a large enough sample of management per- 
sonnel at those levels, it was decided to use 
a more available group—college males. Since 
the primary purpose of this study was 
not to ascertain the differences in opinions 
between top and middle management indi- 
viduals, but rather to determine the effect of 
top and middle management sets on the SDI, 
it was felt that individuals who were reason- 
ably familiar with the business world either 
through experience, education, or business 
contacts would be able to respond appropri- 
ately to the specific sets. 

The following specific effects of top and
middle management sets were investigated:
(a) a change in level of scores on the DMA
scale of the SDI, when Ss are operating under
a top management set in contrast with operating
under a middle management set; (b) the
degree and direction of the correlation between
top and middle management sets on
the DMA scale; (c) the spread of influence
of top and middle management sets to the
other five scales, in addition to the DMA,
which have been developed for use with the
SDI; these other scales include ones for Initiative,
Intelligence, Occupational Level, Self-Assurance,
and Supervisory Qualities.

The above paragraph presents the primary 
effects that were investigated. One additional, 
but secondary, effect was also studied. A 
variation in set was introduced for each man- 
agement level in order to compare descrip- 
tions when Ss tried to place themselves in 
the roles, with descriptions when they tried 
to picture how others actually fill the roles. 
Thus, one top management set was couched 
in terms “if you were a top management 
man,” and another top management set used 
terms that in effect asked the S “how would 
a top management man” (fill out the inventory).
Identical variations were used for
the middle management level.


Method 
Subjects 


The Ss were 44 male undergraduates en- 
rolled in psychology courses, chiefly upper 
division classes in industrial psychology. 


Procedure 


The SDI was administered to the Ss by presenting 
a booklet containing a general instruction sheet and 
specific instruction sheets to be used with the SDI 
when filled out under each of five “sets.” The first 
page of the booklet told the S: “In this booklet you 
will find several copies of a Self Description Inven- 
tory for you to fill out. The sheet before each copy 
of the inventory has instructions for you to follow 
in filling out that copy. Execute each inventory in 
the order in which they are arranged. Be sure to 
follow the instructions on the sheets in front of 
each inventory.” 

The instructions for each of the five sets used in
the study were as follows (the labels being omitted
on the instruction sheets):

Self set: “The purpose of this inventory is to 
obtain a picture of the traits you believe you possess 
and to see how you describe yourself. There are no 
right or wrong answers, so try to describe yourself 
as accurately and honestly as you can.”

“Would Top” set: “Fill out this next inventory 
as you think a top management man would fill 
it out.” 

“If Top” set: “Fill out this next inventory as you 
would if you were a top management man.” 

“Would Middle” set: “Fill out this next inventory 
as you think a middle management man would fill 
it out.” 




“If Middle” set: “Fill out this next inventory as 
you would if you were a middle management man.” 

The Self set was always administered first, in order 
to avoid any possible contamination from the other 
sets. The order of the four management sets was 
counterbalanced to control for order effects among 
these sets. No definition of top or of middle man- 
agement was given to the Ss, each S being left free 
to make his own interpretations of the terms. This 
was done in order to avoid over-attention by the 
Ss to a particular definition of the terms; it was 
felt that the terms were sufficiently well known to 
all Ss. 
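
The counterbalancing described above can be illustrated with a short sketch. The article does not say which counterbalancing scheme was used, so the scheme below (cycling through all 24 possible orders of the four management sets, with the Self set fixed in first position) is an assumption, and the function name `counterbalanced_orders` is hypothetical.

```python
from itertools import permutations

# Labels taken from the instruction sheets described in the Procedure.
MANAGEMENT_SETS = ["Would Top", "If Top", "Would Middle", "If Middle"]


def counterbalanced_orders(n_subjects):
    """Return one administration order per subject (hypothetical scheme).

    The Self set is always first. The four management sets cycle through
    all 4! = 24 possible orders, so each order is used about equally often.
    """
    all_orders = list(permutations(MANAGEMENT_SETS))
    return [["Self", *all_orders[i % len(all_orders)]] for i in range(n_subjects)]


# One order for each of the 44 Ss in the study.
orders = counterbalanced_orders(44)
```

With 44 Ss and 24 orders, no exact balance is possible; under this assumed scheme 20 orders are used twice and 4 are used once.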

Each inventory filled out under each of the five 
sets was scored for the six scales developed from the 
SDI: Decision-Making Approach, Supervisory Quali- 
ties, Occupational Level, Intelligence, Initiative, and 
Self-Assurance. 


Results and Discussion 


Table 1 presents the means and standard
deviations for each scale on each of the five
sets. Since each S completed the SDI under
all five sets, and each SDI was scored on all
six scales, the means in Table 1 are based
on an N of 44. Table 1 also indicates the
t-test results for differences between the means
of the Self set and each of the management
sets on each scale.
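
Because every S completed the inventory under every set, one natural way to compare the Self mean with a management-set mean is a t test on the paired differences. The article does not state which form of t test was computed, so the following is only a sketch under that assumption; the scores are hypothetical and are not data from the study.

```python
import math


def paired_t(x, y):
    """Return (t, df) for the mean of the pairwise differences y - x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    d = [b - a for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)  # sample variance
    se = math.sqrt(var_d / n)  # standard error of the mean difference
    return mean_d / se, n - 1


# Hypothetical DMA scores for five Ss under two sets (illustration only).
self_scores = [18, 20, 17, 22, 19]
top_scores = [21, 22, 18, 25, 20]
t, df = paired_t(self_scores, top_scores)
```

The resulting t would be referred to a t table with N - 1 degrees of freedom, which is 43 for the 44 Ss of the study.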

Table 2 presents the results of analyses of
variance performed on each scale, using the
data from the four management sets. Table
3 shows the correlations among the Self, If
Top, and If Middle sets for each of the six
scales.

Tables 1 and 2 present the necessary data 
for evaluating the first effect being investi- 
gated in this study. The means for the four 
management sets presented in Table 1 and 
the analysis of variance for the DMA scale 
using these means presented in Table 2 dem- 
onstrate that the two top management sets 
produced significantly higher mean scores on 
the DMA scale compared with the two middle 
management sets. Thus, it appears that indi- 
viduals operating under a top management set 
at the time of taking the SDI can obtain 
significantly higher scores than if they are 
operating under a middle management set. 
Also, the tests of significance given in Table 
1 indicate that for these Ss, both of the top 
and one of the middle management sets cause 
significant increases on DMA scores in com- 
parison with a self set. 


The second effect that was evaluated in
the data was whether there was a significant
correlation between top and middle management
sets on the DMA scale. Since the
analysis of variance performed on the management
sets for the DMA scale in Table 2
showed no significant differences between the
“If” and “Would” sets, the If Top and If
Middle sets were used to represent the top
and middle management sets for the correlations
presented in Table 3. Table 3 shows
that there is a significant positive correlation
of .501 between individuals’ scores on the
DMA scale when the inventory is taken under
a top set and the DMA scores for these individuals
when operating under a middle management
set. This positive correlation indicates
that an individual’s position on the
DMA relative to the positions of other individuals
tends not to be affected by the particular
set operating for all individuals. That
is, those individuals who scored high on the
DMA under a top management set also
tended to score high on the DMA under a
middle management set, and likewise those
who scored relatively low under one set were
also low under the other set.


Table 1

Means and Standard Deviations

Sets (columns): Self and the four management sets (“If Top,” “Would Top,” “If Middle,” “Would Middle”)
SDI Scales (rows): DMA, Initiative, Intelligence, Occupational Level, Self-Assurance, Supervisory Qualities, each with a mean (X) and a standard deviation (σ)

[The cell positions of this table are not recoverable. Legible values, in the order in which they appear:
20.80**, 4.75; 21.41**, 5.00; 19.00, 4.70; 19.75*, 4.11;
36.09**, 35.98**, 5.96; 35.14**; 32.35**, 7.14, 6.56;
44.36*, 6.20; 39.7[?], 39.98, 7.33, 7.38; 39.84**, 8.32; 41.36**, 9.51;
29.82**, 29.59**, 5.64, 4.92; 30.93*, 5.97; 26.64, 5.66]

* Significant from Self mean at .05 level
** Significant from Self mean at .01 level
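
The distinction drawn here, between a change in the level of scores and a change in the relative positions of individuals, can be made concrete with a small sketch. The scores below are hypothetical and idealized: every S gains the same number of points under the top set, so the mean rises while the correlation between sets is perfect. The correlation reported in the study (.501) is of course well short of this limiting case.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical DMA scores (not data from the study).
middle = [15, 18, 20, 22, 25]
top = [score + 2 for score in middle]  # every S scores 2 points higher

mean_shift = sum(top) / len(top) - sum(middle) / len(middle)
r = pearson_r(middle, top)
```

In this idealized case the set raises the mean by two points but leaves each S's standing relative to the others untouched.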

Table 1 and Table 2 provide data to test
whether the top and middle management sets
affect other scales in addition to the DMA
scale. Krug (1958) has recently demonstrated
that if Ss are given specific sets in
accordance with some of the SDI scales, the
effect of a set will be to raise the score on
that scale and to generalize to some of the
other scales in the inventory. As can be seen
by the means in Table 1 and the analyses of
variance in Table 2, Ss scored significantly
higher on all of the other scales when filling
out the SDI with top management sets. Also,
Table 1 shows that on three of the six scales
(Initiative, Occupational Level, and Self-Assurance)
all management sets caused a
significant increase in scores compared to
the Self set. On the DMA scale three of the
four sets produced significant increases; only
on the Intelligence and Supervisory scales did
a majority of the sets fail to increase the
scores significantly. In only 4 of 24 instances
did the Self set produce higher means, and
none of these was significantly higher. These
findings, in general, confirm those of Krug
regarding the spread of a particular set’s influence.
In the present study, a top management
set produces greater spread than does
a middle management set.

As previously mentioned, the data presented
in Table 2 show that the Ss in this
study did not give different results between
sets that attempted to get at how they would
picture themselves in the management positions
(the “If” sets), and how they perceive
others in these positions (the “Would” sets).
The particular college males contained in our
sample apparently feel they would fulfill the
roles about as they perceive others fulfilling
the roles.


Table 2

Analyses of Variance of Four Management Sets on Each SDI Scale

Sources (rows), with legible df: Mgmt. Level; Would-If; Subjects (43); Mgmt. Level X Would-If (1); Residual (129)
Scales (columns), each with MS and F: DMA, Initiative, Intelligence, Occupational Level, Self-Assurance, Supervisory Qualities

[The cell positions of this table are not recoverable. Legible values, in the order in which they appear:
131.28, 20.46, 38.88; 182.09, 794.75; 56.31, 6.56; 64.83, 7, 107; 85;
0.18, 14.67; 62.62, 36.18; 0.82, 32.96;
712.02, 16.09*, 34.57; 151.82; 719.63, 3.38, 81.86;
26.86*, 3.43*, 3.06*; 17.82, 44.24; 3.21, 26.79]

* Significant at .01 level


Table 3

Correlations Between Sets Within Scales

                          Self/        Self/          “If Top”/
Scale                     “If Top”     “If Middle”    “If Middle”

DMA                       .413**       [illegible]    .501**
Initiative                .359*        [illegible]    .173
Intelligence              .484**       .418**         .388*
Occupational Level        .436**       [illegible]    .444**
Self-Assurance            .461**       .441**         .629**
Supervisory Qualities     .399**       .294*          .549**

* Significant at <.05
** Significant at <.01


Taken as a whole, the results of this experiment
indicate that a top management set can
raise scores on all scales of the SDI when 
compared with a middle management set or 
a self set. However, the positive correlations 
between scores obtained under different sets 
also show that a top or middle management 
set does not greatly alter the relative posi- 
tions of individuals on the DMA or other 
scales of the SDI (except, possibly, the Ini- 
tiative scale). This indicates, for example, 
that if an individual assumes a top manage- 
ment set while others assume a self set, he 
can raise his scores on the various scales rela- 
tive to the others’ scores on the scales; if, 
however, the others used a top set instead of 
a self set, the individual’s relative position 
would not be greatly different from what it 
would have been if everyone had used a self 
set. The implications of these results for 
organizations using the SDI as a selection 
device would seem to be that for the type 
of sets used in this study the predictions for 
a given individual will not be strongly affected 
by the particular set he uses as long as others 
do not use entirely different sets. In situa- 
tions where predictions are desired for spe- 
cific individuals (as contrasted to those situa- 
tions where composite descriptions of large 
groups are desired), it would seem to be
important to try to induce a uniform set for
all individuals. Any uniformity that could
be produced should be at least as important
as the nature of the particular set.


Summary 


Forty-four male undergraduates drawn
chiefly from courses in industrial psychology
were given the Ghiselli Self-Description Inventory
to execute under five different “sets”:
a Self set, two top management sets, and two
middle management sets. Each inventory
given under each set was scored on the six
scales of the SDI; emphasis, however, was
given to the results for the Decision-Making
Approach scale which had been derived previously
from self descriptions of top and middle
management personnel. The major results
of the present study were:


1. Top management sets produced significantly
higher mean scores on the DMA scale
than did middle management sets.

2. There was a significant positive correlation
between top and middle management
sets on the DMA scale.

3. Subjects tended to score significantly
higher on the other SDI scales when operating
under top management sets.


Received August 25, 1958 


References 


Ghiselli, E. E. The forced-choice technique in self
description. Personnel Psychol., 1954, 7, 201-208.

Ghiselli, E. E., & Lodahl, T. M. Patterns of managerial
traits and group effectiveness. J. abnorm.
soc. Psychol., 1958, 57, 61-66.

Krug, R. E. The effect of specific selection sets on a
forced-choice self-description inventory. J. appl.
Psychol., 1958, 42, 89-92.

Porter, L. W., & Ghiselli, E. E. The self perceptions
of top and middle management personnel.
Personnel Psychol., 1957, 10, 397-406.





Journal of Applied Psychology
Vol. 43, No. 3, 1959


EFFECTS OF MASSED PRACTICE AND THICKNESS 
OF HANDCOVERINGS ON MANIPULATION 
WITH GLOVES 


HILDE GROTH AND JOHN LYMAN


University of California, Los Angeles 


Prior work in this laboratory has indicated 
a consistent relationship between the amount 
of prehension force applied to a manipulated 
object and the coefficient of friction between 
the object and the handcovering (Groth & Ly- 
man, 1958; Zweizig & Lyman, 1957). How- 
ever, results of these experiments did not de- 
lineate performance changes which might be 
induced by thickness of handcovering mate- 
rials or by massed practice. The present ex- 
periment accordingly was carried out to ex- 
amine the specific effects of these variables. 

A search of the literature provided only 
scanty information concerning the effects of 
practice upon gloved performance. Teichner 
and collaborators (Teichner, Kobrick, & 
Dusek, 1954) reported improvement of speed 
on the Minnesota Rate of Manipulation Test 
for three types of gloves without finding such 
a trend for the bare hand. Their results seem 
“. . . to support the notion that with sufficient
practice difficult glove tasks become considerably
reduced in difficulty to the point
where relative impairment might possibly be
very small” (Teichner et al., p. 24). Evidence
for improvement of skill on two other
manipulatory tasks while wearing gloves has 
also been found by Blair and Gottschalk 
(1947). In this case, the practice sessions 
were distributed over several days and the au- 
thors found improvement of speed for bare 
hand performance as well as for the glove 
condition. This result makes it difficult to 
differentiate between some over-all learning 
effect and that due to practice with gloves. 

1 This investigation was supported by QM Contract
No. DA 44-109-9M-1531 between the U. S.
Army QM Corps and the University of California,
Los Angeles. The opinions expressed are those of the
authors and do not necessarily reflect those of the
contracting agency.

Since our previous investigations have indicated
a fair relationship between task stress
and prehension force levels we expected to
find a gradual increase in applied force from
start to finish of a massed trial (Groth, 1957).
However, studies by other investigators (Hov- 
land, 1951; Telford & Swenson, 1942) of 
variations of tension in the muscles specific 
to a task have shown little agreement as to 
predictability of tension increases or decreases 
attributable to practice during motor skill 
learning. Therefore, our primary interest in 
this investigation was directed toward evalua- 
tion of performance changes during the course 
of a prolonged trial as well as changes in 
absolute level when comparing over-all per- 
formance on short trials with those of longer 
duration. Simultaneously, we explored the 
effects of thickness of selected handcovering 
materials on three criterion measures of ma- 
nipulatory skill. 


Procedure 


Subjects. Twenty-four male undergraduate engi- 
neering students served as experimental subjects. 

Apparatus and task. The electronically-controlled 
display and control matrix and the cylindrical pre- 
hension force transducer have been described in de- 
tail elsewhere (Groth, 1957). In a self-paced task, 
the Ss were asked to place the cylinder into a recess 
in the control board which corresponded in location 
to a light in the display matrix. Cylinder placement 
completed an electrical contact which switched the 
display light to another location. This lighting se- 
quence appeared random to the Ss. 

The following measurements were recorded at
three-minute intervals: (a) the time integral of prehension
force, (b) the sum of the transport times,
(c) the sum of the cylinder transports.
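
The three recorded quantities are the raw material for the criterion measures that the Results section later defines: mean prehension force is the integral of force divided by total transport time, and time per transport is total transport time divided by the number of transports. A minimal sketch of that arithmetic follows. The function name, the units (gram-seconds and seconds), and the input values are assumptions made for illustration and are not taken from the study.

```python
def criterion_measures(force_integral, total_transport_time, n_transports):
    """Derive mean prehension force and time per transport.

    Assumed units: force_integral in gram-seconds, total_transport_time
    in seconds, n_transports as a count.
    """
    if total_transport_time <= 0 or n_transports <= 0:
        raise ValueError("transport time and transport count must be positive")
    mean_pf = force_integral / total_transport_time  # grams
    time_per_transport = total_transport_time / n_transports  # seconds
    return mean_pf, time_per_transport


# Hypothetical readings for one three-minute interval.
mean_pf, time_per_transport = criterion_measures(180000.0, 150.0, 200)
```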

Changes in surface friction and thickness of glove 
material were controlled by the choice of handcover- 
ings with the following characteristics: 


bare hands,                                μ = .73, thickness 0
knit cotton gloves,                        μ = .25, thickness 0.25 mm
army leather gloves,                       μ = .55, thickness 0.75 mm
army arctic mittens with fleece liner,     μ = .55, thickness 3.50 mm

The coefficients of surface friction between the
handcovering and aluminum were determined by the
drag method described previously (Groth & Lyman,
1958). Thickness of the uncompressed gloves was
measured with calipers.

Contact area during manipulation was controlled
by instructions for grasping the cylinder.

Routine. The experiment was conducted during
October, 1957. Room temperatures ranged from 25° C
to 30° C. The Ss were assigned randomly to four
experimental groups, each S performing with one
type of handcovering only. After familiarization,
the task was administered to both the right and the
left hands while in a standing posture. The sequence
of right-left administration was counterbalanced
within each experimental group. Each S came
for one experimental session, performing 30 min.
with each hand. The two runs were separated by a
5 min. rest pause during which the Ss were required
to sit down.

Subjects performing with the bare hand dried their
hands thoroughly with a turkish towel before the
run. Perspiration and any consequent changes in
surface friction could not be controlled during bare
hand performance.

Calculations. 1. The effects of massed practice on
prehension force, number of transports, and time per
transport were assessed by analysis of variance.
Comparisons were made for the readings taken at
6, 12, 18, 24, and 30 min. of performance.

2. The effects of handcoverings on prehension
force, number of transports, and time per transport
were assessed by analysis of variance. Comparisons
were made among the mean values for the full 30-min.
period.

Differences between individual treatments were
evaluated by Duncan’s (1955) Multiple Range Test
when appropriate.

The significance level was set at P < .05.


Results


Effects of practice on gloved performance.
Criterion measures: (a) Mean prehension
force (PF), obtained by dividing the integral
of force by total transport time. (b) Total
number of transports during the 30-minute
test trials. (c) Time per transport, obtained
by dividing total transport time by total number
of transports.


Table 1

Summary of F-values of Analyses of Variance for Practice Effects

Columns: Prehension Force (Right F, Left F); Number of Transports (Right F, Left F); Time per Transport (Right F, Left F)
Rows (Handcovering): Arctic Mitten, Cotton Glove, Leather Glove, Bare Hand

[The cell positions of this table are not recoverable, and leading decimal points appear to have been lost. The one value that can be placed from the text is the F for right-hand prehension force with the leather glove, 3.14*. Other legible values, in the order in which they appear: 30, 39, 04, 1.42, 56, 04, 13, 30, 10, 08, 26, 1.36, 14, 07, 1.43, 74, 63, 1.03, 83]

* P < .05


[Figure: points plotted for Bare Hand, Cotton Glove, Leather Glove, and Arctic Mitten; legible axis labels are Coefficient of Friction, Mean Seconds per Transport, and Mean Number of Transports.]

Fig. 1. Mean performance and regression lines for right-hand performance.


2 The statistical tables have been deposited with
the American Documentation Institute. Order Document
No. 5886 from A.D.I. Auxiliary Publications
Project, Photoduplication Service, Library of Congress,
Washington 25, D. C., remitting in advance
$1.75 for microfilm or $2.50 for photocopies. Make
checks payable to Chief, Photoduplication Service,
Library of Congress.





Hilde Groth and John Lyman 


Table 2 


Mean Prehension Force and Variability 


( % reff. of 
Thickness Friction 


Conditions (mm.) (u) 


0.73 
0.25 


0.55 


Bare Hand 
Cotton Glove 
Leather Glove 


Arctic Mitten 0.55 


The statistical analyses of performance 
changes as a function of time failed to reach 
significance with one exception, PF for right- 
hand performance with the leather glove. 
Table 1 reports the summary of the F values 
of the analyses of variance. However, in- 
spection of the plotted data indicated that 
these fell into two distinct groups for all 
handcovering conditions: “poor” performance 
and “good” performance. For prehension 
force “poor” performance was obtained with 
arctic mittens and cotton gloves, and “good” 
performance with bare hands and _ leather 
gloves. The diametrically reversed conditions 
were found for number of transports and time 
per transport. Regression lines were deter- 


Left Hand 


Right Hand 


X (gms.) s (gms.) X (gms.) s (gms.) 


1214 137 
2140 129 
2018 128 


2535 205 


1102 196 
2674 829 
1514 216 
3354 385 


mined graphically for these two performance 
groups (Askovitz: 1955a, 1955b). These are 
shown for the right hand in Fig. 1. 

Little change in prehension force during the 
trial can be detected from the regression lines. 
Furthermore, practice seems to exert opposite 
effects upon the two groups. 

Regression lines for the number of trans- 
ports and the time per transport show a slight 
but consistent trend of performance facilita- 
tion from the beginning to the end of the prac- 
tice session for both groups. 

Effects of handcoverings on performance. 
Criterion measures: (a) mean _ prehension 
force, (b) total number of transports, (c) 
time per transport. The grand means and 


Table 3 


Mean Time per Transport and Variability 


Coeff. of 
Thickness Friction 


Conditions (mm.) (p) 


0.73 
0.25 
0.55 


0.55 


Bare Hand 

Cotton Glove 
Leather Glove 
Arctic Mitten 


Right Hand Left Hand 


X (sec.) s (sec.) X (sec.) s (sec.) 


0.89 
0.65 
0.77 


0.55 


0.32 
0.34 
0.17 
0.41 


0.84 
0.67 
0.75 


0.55 


Table 4 


Mean Number of Transports and Variability 


( ‘oeff. ol 
Friction 


Thickness 


Conditions mm.) (u) 


0.73 
0.25 
0.55 


0.55 


Bare Hand 

Cotton Glove 
Leather Glove 
Arctic Mitten 


Right Hand Left Hand 


X x 





Manipulation with Gloves 


their respective variabilities were taken as the basic data for the statistical analyses in order to obtain greater stability by reducing the variability. These data are reported in Tables 2, 3, and 4.

Statistical significance was not obtained for all analyses, but the graphical representation indicates a consistent relationship of the handcoverings to the coefficient of friction. These analyses are reported in Tables 5, 6, 7, 8, and 9. Number of transports and time per transport show some performance facilitation with a decrease in the coefficient of friction, whereas prehension force shows an impairment of performance with such a decrease. Performance with the fleece-lined arctic mittens corresponds to performance of a thin handcovering with a very low coefficient of friction for all three criterion measures. Figures 2 and 3 present these results graphically.


Table 5

Analysis of Variance for Prehension Force

Source of Variance    Sum of Squares    Mean Squares
Right Hand
  Treatments          19,365,000        6,455,000
  Within              26,802,000        1,340,000
  Total               46,167,000
Left Hand
  Treatments           5,532,000        1,844,000
  Within              12,318,000          616,000
  Total               17,851,000

*P < .05


Discussion

The effects of massed practice obtained in this study fall into two categories: those pertaining to performance changes during the 30 minutes of performance and changes of over-all performance level when compared to studies of shorter trial duration.

Prehension force showed no evidence of a trend during the 30 minutes. These results appear to be consistent with Telford and Swenson’s (1942) later trials during mirror tracing practice. Perceptually, our task was much simpler than mirror tracing, which might account for the absence of an increase in muscular tension toward the end of the trial.

Performance speed and total output showed 
a trend indicating slight performance facilita- 
tion with practice. However, this trend was 
obtained for gloved manipulation as well as 
for the bare hand condition. Since prior work 
with the same apparatus had never shown any 
practice effect for bare hands when trials were 
separated by several days, we felt reasonably 
assured that no such changes would occur 
during the course of a single prolonged trial 
(Groth, 1957). Our results show, however,
that we encountered a similar problem to that


Table 6

Duncan’s Multiple Range Test for Difference in PF Between Any Two Treatments for Right-Hand Performance

Conditions: Bare Hand (A), Leather Glove (B), Cotton Glove (C), Arctic Mitten (D)

                Shortest Significant
Comparisons     Ranges (P = 5%)         Obtained Ranges
D-A             1049
D-B             1021
D-C              971                     680
                1021                    1572*
                 971                    1160*
                 971                     412
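
The decision rule underlying Table 6 can be sketched in a few lines: a pair of treatments is taken to differ at the 5% level when its obtained range exceeds the shortest significant range for that comparison. The sketch is illustrative only. It uses the four rows of Table 6 for which both entries are legible, and the names `rows` and `is_significant` are assumptions introduced here, not part of the original analysis.

```python
# (shortest significant range, obtained range) for the Table 6 rows
# in which both values are legible in the source.
rows = [(971, 680), (1021, 1572), (971, 1160), (971, 412)]


def is_significant(shortest_significant_range, obtained_range):
    """Duncan-style decision: significant when the obtained range is larger."""
    return obtained_range > shortest_significant_range


flags = [is_significant(ssr, obtained) for ssr, obtained in rows]

for (ssr, obtained), flag in zip(rows, flags):
    # Mark significant comparisons with an asterisk, as the table does.
    print(ssr, obtained, "*" if flag else "")
```

The two rows flagged by this rule are the ones that carry an asterisk in the table.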







of Blair and Gottschalk in their study and we cannot, therefore, attribute a certain amount of performance facilitation to “learning how to perform with gloves.”

When comparing the results for the over-all trial length with such scores taken previously on trials of shorter duration, we found several interesting changes (Groth & Lyman, 1958). Prehension force as a function of surface friction showed the same trend as before but a considerable increase in absolute force level was recorded (Fig. 2). An explanatory hypothesis for this finding probably should be sought in terms of an “Einstellungseffekt” responsible for raising the general tension level because of knowledge of the prolonged trial period.


Table 7

Analysis of Variance for Number of Transports

Source of Variance    Sum of Squares    Mean Squares
Right Hand
  Treatments           2,800
  Within              17,350
  Total               20,150
Left Hand
  Treatments           6,380
  Within              14,720
  Total               21,100


Speed and total work output both increased 
with a decrease in surface friction in this 
study. A possible post hoc explanation might 
be suggested in terms of “overcompensation.” 
The S might try to offset these nonoptimal 
conditions by “working faster.” However, 
both of these hypotheses need empirical vali- 
dation. Our prior investigation failed to show 
any predictable relationship between speed 
and surface friction and indicated a slight 
decrease in total output for conditions of ex- 
tremely low surface friction associated with 
a thin coating of a silicone grease on the sur- 
face of the fingers and gloves. 

Thickness of handcovering material became important for the extreme condition, namely the arctic mittens. In terms of changes in quality of performance, the mittens were equivalent to a thin handcovering with a very low coefficient of friction. This held true for all three measures.


Table 8

Analysis of Variance for Time per Transport

Source of Variance    Sum of Squares    Mean Squares
Right Hand
  Treatments
  Within
  Total
Left Hand
  Treatments
  Within
  Total

The results of this investigation supported 
our earlier findings demonstrating the impor- 
tance of characteristics of surface friction and 
bulkiness of material for the design of pro- 
tective handcoverings. We did not find any 
evidence for the assumption that effort can be 
reduced by practice on the type of manipula- 
tion task we used. We would like to em- 
phasize the relatively low skill level required 
by our task, however, and point out that this 
may have been a large factor in our results. 

This problem of task specificity has been 
brought out very clearly in a report by Brad- 
ley (1957) who investigated certain glove 
characteristics on control manipulability. He 
found a considerable interaction of perform- 
ance with a large variety of gloves and the 
type of control operation required. 




Table 9

Duncan’s Multiple Range Test for Difference in Time per Transport Between Any Two Treatments for Left-Hand Performance

Conditions: Bare Hand (A), Leather Glove (B), Cotton Glove (C), Arctic Mitten (D)

                Shortest Significant
Comparisons     Ranges (P = 5%)         Obtained Ranges
A-D             .318                    .34*
A-C             .310                    .24
A-B             .295
B-D             .310
B-C             .295
C-D             .295


Summary 
This study was designed to evaluate the im- 
portance of surface friction and thickness of 
handcovering materials during prolonged ma- 
nipulatory performance. 





[Figure: mean prehension force (gms) plotted against coefficient of friction (µ), with “just slip” curves and curves for trial durations of 3 minutes and 30 minutes; labeled conditions include arctic mitten, cotton glove, leather glove, and silicone.]

Fig. 2. Relation of coefficient of friction to mean prehension force. (Right-hand performance.)





[Figure: panels for mean number of transports, mean prehension force (grams), and mean seconds per transport, plotted against time in minutes; legend: arctic mittens, cotton gloves, leather gloves, bare hands, grand means, regression lines.]

Fig. 3. Relation of coefficient of friction to time per transport and number of transports. (Right-hand performance.)


Surface friction and thickness of material 
were controlled by the following experimental 
conditions: 


bare hands, µ = .73, thickness = 0

knit cotton gloves, µ = .25, thickness = 0.25 mm.

army leather gloves, µ = .55, thickness = 0.75 mm.

army arctic mittens, fleece lined, µ = .55, thickness = 3.50 mm.


Manipulatory skill was evaluated by three 
criterion measures: mean prehension force, 
total number of transports, and mean time 
per transport. The measures were taken at 
three-minute intervals. 

Twenty-four male Ss performed a simple 
manipulation task of 30 minutes’ duration. 
The Ss were randomly divided into four 
groups of six Ss each. Each group performed 
with one type of handcovering only. 
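
As a check on the analyses of variance, the degrees of freedom follow from this design (four groups of six Ss), and the mean squares and F ratio follow from the sums of squares in Table 5. The sketch below is a minimal illustration under that assumption. The function name `one_way_f` is hypothetical, and the 5% critical value of F for 3 and 20 degrees of freedom (about 3.10) is taken from standard tables rather than from the article.

```python
def one_way_f(ss_treatments, ss_within, n_groups, n_per_group):
    """Return (df_treatments, df_within, F) for a one-way design with equal groups."""
    df_treatments = n_groups - 1
    df_within = n_groups * (n_per_group - 1)
    ms_treatments = ss_treatments / df_treatments
    ms_within = ss_within / df_within
    return df_treatments, df_within, ms_treatments / ms_within


# F(.05; 3, 20) from standard tables (an assumption, not a value from the article).
F_CRITICAL_05 = 3.10

# Sums of squares for prehension force as given in Table 5.
_, _, f_right = one_way_f(19_365_000, 26_802_000, n_groups=4, n_per_group=6)
_, _, f_left = one_way_f(5_532_000, 12_318_000, n_groups=4, n_per_group=6)

print(round(f_right, 2), f_right > F_CRITICAL_05)
print(round(f_left, 2), f_left > F_CRITICAL_05)
```

Under these assumptions only the right-hand F exceeds the critical value, which is in keeping with Table 6 being reported for right-hand performance.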

Analysis of the results failed to show a time 
trend for prehension force, but regression lines 
for the number of transports and for time per 
transport indicated a slight but consistent im- 
provement of performance throughout the 30 
minutes. The four experimental conditions 
fell into two fairly distinct performance 
groups which could be classified into “poor” 
performance and “good” performance. 
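
A regression line of the kind referred to above can be fitted by ordinary least squares. The following is a sketch only: the summary does not state how the lines were fitted, the transport counts are hypothetical values invented for illustration, and `least_squares_line` is a name introduced here.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Measures were taken at three-minute intervals over the 30-minute trial.
minutes = list(range(3, 31, 3))
# Hypothetical transports per interval, NOT data from the study.
hypothetical_transports = [40, 41, 41, 42, 43, 43, 44, 44, 45, 46]

slope, intercept = least_squares_line(minutes, hypothetical_transports)
print(slope > 0)  # a positive slope corresponds to improvement over the trial
```

For number of transports a positive slope indicates improvement; for time per transport the corresponding improvement would appear as a negative slope.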

Graphical comparison of mean prehension
force on the prolonged trial and on the short
trials of our previous investigation showed a
considerable increase in absolute level for the
30-minute run.

All three criterion measures were directly 
affected by change in surface friction, and to 
a lesser extent by thickness of the material. 
Performance with the arctic mittens corre- 
sponded to performance with a thin hand- 
covering with a very low coefficient of friction. 

The results were discussed in relation to 
other studies and to practical implications for 
the design of protective handgear. 


Received July 7, 1958 


References 


Askovitz, S. I. Rapid method for determining mean 
values and areas graphically. Science, 1955, 121, 
212. (a) 

Askovitz, S. I. Mean rates of change and least 
squares—interpretations and rapid graphic meth- 
ods. J. appl. Psychol., 1955, 8, 347-352. (b) 

Blair, E. A., & Gottschalk, C. W. Efficiency of signal 
corps operators in extreme cold. U.S. Army Med. 
Res. Lab. Rep., 1947, Rep. No. 2. 

Bradley, J. V. Glove characteristics influencing con- 
trol manipulability. USAF WADC Tech. Rep., 
1957, No. 57-389. 

Duncan, D. B. Multiple range and multiple F tests. Biometrics, 1955, 11, 1-42.

Groth, Hilde. An experimental assessment of prehension force as a measurement of effort in psychomotor skills. Unpublished doctoral dissertation, Univer. California, Los Angeles, 1957.

Groth, Hilde, & Lyman, J. Effects of surface friction on skilled performance with bare and gloved hands. J. appl. Psychol., 1958, 42, 273-277.

Hovland, C. I. Human learning and retention. In S. S. Stevens (Ed.), Handbook of experimental psychology. New York: Wiley, 1951.

Lindquist, E. F. Design and analysis of experiments in psychology and education. Boston: Houghton Mifflin, 1953.

McNemar, Q. Psychological statistics. New York: Wiley, 1949.

Teichner, W. H., Kobrick, J. L., & Dusek, E. R. Studies of manual dexterity: I. Methodological studies. Natick, Mass.: QM. Res. & Development Center, 1954. (Rep. No. EP-3.)

Telford, C. W., & Swenson, W. J. Changes in muscular tension during learning. J. exp. Psychol., 1942, 30, 236-246.

Walker, H. M., & Lev, J. Statistical inference. New York: Holt, 1953.

Zweizig, J. R., & Lyman, J. The effect of laminar configurations of handcovering materials on mean prehension force. Los Angeles: Dept. of Engineering, Univer. California, 1957. (Rep. No. 57-24.)





Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


AVA VALIDITY FOR TEXTILE WORKERS 1


PETER F. MERENDA AND WALTER V. CLARKE


Walter V. Clarke Associates, Inc. 


The Activity Vector Analysis (AVA) is a 
self concept personality assessment instru- 
ment. It is widely used in industry in the 
classification and selection of personnel at all 
levels of employment. The details of the con- 
struction and application of the AVA have 
been published by Clarke (1956a; 1956b; 
1956c; 1956d). Reliability and validity stud- 
ies on this instrument have been reported 
by Bennett (1957), Clarke (1956c), Hammer 
(1958), Lundin (1957), Merenda (1958), 
Musiker (1958), and Whisler (1957). Most 
of this earlier research on the validity of 
AVA, however, has either been devoted to va- 
lidity in terms of personality description or to 
the problem of classification of personnel on 
the basis of AVA and criterion data derived 
from concurrent samples. The present study 
deals with the validity of AVA (over time) as 
a predictor of on-the-job performance and 
job success, within one company, of first line 
workers in the textile industry. 


Subjects 

Subjects were all (N = 142) first line work-
ers mainly at semiskilled and unskilled levels,
who possessed at least a sixth grade education 
and who were hired by a large southeastern 
textile concern over the 15-month period be- 
tween January 1, 1957, and March 31, 1958. 
Of the 142 Ss, 107 are males and 35 are fe- 
males. Although a variety of jobs is repre- 
sented in this sample, the occupations cluster 
around the relatively low skill level and rou- 
tine operational tasks involved in the manu- 
facture of textile goods. The specific occupa- 
tions represented by this sample are slubber 
tender, doffer, spinner, yarn winder, card 
tender, humidifier man, creeler, oiler, trucker, 
and packer. 


1 The authors gratefully acknowledge the assistance 
and cooperation rendered by J. Vernon Wallace of 
the Bibb Manufacturing Company, Macon, Georgia, 
who supervised the collection of the data for this 
project. 


Criteria 

There were two criterion measures of em- 
ployee success used in the study. The first 
of these was a locally-prepared five-item rat- 
ing scale for measuring Job Proficiency. The 
individual components on this scale are: 1. 
Job Performance; 2. Attitude; 3. Coopera- 
tion; 4. Learning Ability; 5. Attendance and 
Promptness. Each item is scaled qualitatively 
from “very poor” to “very good” with an ac- 
companying numerical scale ranging from 0.0 
to 5.0. A composite score which is the un- 
weighted sum of these five components is ob- 
tained for the scale and yields a range of total 
scale scores from 0.0 to 25.0.

The Job Proficiency Scale was subjected to 
internal consistency analysis. The technique 
used was one developed recently by Stanley 
(1957). It is a generalization of the well- 
known K-R Formula #20 applied to non- 
dichotomously scored items and is algebrai- 
cally equivalent to Cronbach’s (1951) Coeffi- 
cient Alpha and Hoyt’s (1941) analysis of 
variance formulas. 
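
For readers unfamiliar with the coefficient, a minimal sketch of Coefficient Alpha for a five-item scale follows. The ratings are hypothetical (six workers, each item scored from 0.0 to 5.0), the function name `coefficient_alpha` is introduced here for illustration, and sample variances are assumed. This is not the authors' computation.

```python
from statistics import variance


def coefficient_alpha(ratings):
    """Cronbach's alpha; `ratings` holds one row of item scores per person."""
    k = len(ratings[0])                  # number of items on the scale
    items = list(zip(*ratings))          # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in ratings])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings, NOT data from the study: rows are workers, columns are
# the five items (Job Performance, Attitude, Cooperation, Learning Ability,
# Attendance and Promptness).
hypothetical_ratings = [
    [4.0, 3.5, 4.0, 3.0, 4.5],
    [2.0, 2.5, 3.0, 2.0, 2.5],
    [5.0, 4.5, 4.0, 4.5, 5.0],
    [3.0, 3.0, 3.5, 2.5, 3.0],
    [1.5, 2.0, 2.0, 1.0, 2.5],
    [3.5, 4.0, 3.0, 3.5, 4.0],
]

alpha = coefficient_alpha(hypothetical_ratings)
composite_scores = [sum(row) for row in hypothetical_ratings]  # unweighted sums, 0.0 to 25.0
print(round(alpha, 2), composite_scores)
```

The coefficient approaches 1.0 as the items rank the workers alike, which is the sense in which a high value supports using the unweighted composite.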

Internal consistency coefficients for three 
raters of this study using the instrument at 
various time intervals are reported in Table 1. 
The ratings were made by the criterion raters 
at the end of 30 days and again at 90 days 
after first employment for those who were 
still on the payroll on those dates. For the 
attrition group the ratings were made on the 
severance date. 

It will be noted in Table 1 that the six in- 
ternal consistency coefficients reported therein 
are all substantially high. Hence, the scale 
appears to be relatively homogeneous and the 
use of the composite score as a criterion meas- 
ure is permissible. 


Raters 


There were two categories of raters for this 
study. The predictor raters were the com- 
pany personnel manager who was a trained 
interpreter (analyst) of the AVA and the assistant
personnel manager (interviewer) who
was not trained to use the AVA. The cri- 
terion raters were the supervising foremen, re- 
spectively, of the Ss of this study. 

The 142 Ss of this study were divided 
among six foremen. However, separate analy- 
ses were made for only three of the raters, 
since these three supervised the great ma- 
jority (80%) of the sample. The remainder 
of the Ss was scattered among the other three 
foremen. Consequently, the n’s for these
others were too small for practical purposes 
and, therefore, these latter were not analyzed 
separately. The analyses based on the total 
group include all cases under all six foremen. 


Procedure 


Beginning in January 1957 and continuing for a 
15-month period, every new applicant for each of 
the jobs listed previously in this article was hired 
solely on the basis of an interview by the “inter- 
viewer” of this study. The decision as to hire or 
not hire was made completely without any reference 
to the AVA. Because of the employment situation 
relative to the supply and demand of workers for 
the jobs of the study, especially during peak produc- 
tion periods, it was necessary to hire persons who 
did not meet the criteria for employment established 
by the interviewer.

Within a day or two after each of the Ss of this study was hired, he was administered an AVA by the interviewer who originally hired him. The AVA analyst was given the completed AVA forms. He then scored and interpreted the profiles of results. He had no personal contact with the S prior to the administration and scoring of the AVA.

Both the AVA analyst and interviewer predicted the expected job proficiency of these Ss using the rating scale just described. The ratings were independently made. The interviewer made his predictions solely on the basis of information gathered during the employment interview. The AVA analyst made his predictions solely on the basis of information revealed to him through his interpretation of the AVA. Hence, the predictions were actually made on the basis of “blind analyses” of AVA.

The AVA analyst, in addition to his knowledge of AVA theory and application, also was well acquainted with the nature of the occupations for which he was predicting and with the over-all company philosophy regarding the treatment of workers. This additional knowledge, no doubt, lent considerable assistance in his interpretation of AVA profiles for the purpose of predicting success or failure of the Ss of the study.

Predictions were made by the AVA analyst of job performance ratings for each of the 142 Ss. Unfortunately, however, the interviewer did not begin his predictions until after the project had been under way for several months. Hence, interviewer predictions are available on only a portion of the sample.


Table 1

Coefficients of Equivalence for Three Independent Raters

                        Time of Rating
Criterion Rater     30 Days      90 Days
Foreman I           .87 (57)     .78 (36)
Foreman II          .91 (38)     .94 (29)
Foreman III         .87 (19)     .85 (13)

Note. Figures in parentheses are n’s upon which each coefficient is based.

Comparisons were made, however, between the 49 
commonly rated personnel and the total sample with 
respect to sex distribution, age, educational level, and 
job performance ratings. No statistically significant 
differences were found. Hence, it may be concluded 
that the reduced sample on whom the interviewer 
made his predictions was representative of total 
sample. 

For those workers who had been on the job for 
30 days, their respective foremen were given the Job 
Proficiency Rating Scale and asked to rate them on 
each of the five items on the scale. For those work- 
ers, who for one reason or another did not survive 
the first month, the ratings were made on the last 
day of work with the company. The same process 
was repeated at the end of each 90-day period for
all Ss surviving more than 30 days on the job.
Again, as previously, ratings were made on the last
day of employment for those workers leaving prior
to the expiration of 90 days on the job.

Comparisons were made between the AVA ana- 
lyst’s and interviewer’s predicted ratings with those 
made by the criterion raters. 


Results 


Product moment correlations were calcu- 
lated between predicted and actual rating 
scores for both the AVA analyst and inter- 
viewer. These statistics are presented in 
Table 2. Inspection of the data of this table 
reveals that for the pooled groups of all raters 
substantial correlations were found to exist 
between the predicted and criterion scores for 
both the AVA analyst and the interviewer. 
The over-all validity coefficients were found
to be about equal for both the 30-day pre-
dictions (r = .50 for analyst, r = .41 for in-
terviewer), and the 90-day predictions (r =
.58 for analyst, r = .33 for interviewer). The
differences in the validity coefficients were not
statistically significant.





Table 2

Validity Coefficients for AVA Analyst and Interviewer at End of 30-Day and 90-Day Periods

                                     Criterion Rater                     Analyst vs.
Predictor Rater       I           II           III          All          Interviewer
30-Day Ratings
  AVA Analyst         .48 (57)    .24* (38)    .13* (19)    .50 (142)
  Interviewer         .54 (22)    .47* (16)    **           .41 (49)
90-Day Ratings
  AVA Analyst         .59 (36)    .59 (29)     .09* (13)    .58 (85)
  Interviewer         .66 (18)    .25* (16)    **           .33 (37)
30- vs. 90-Day
  Ratings             .67 (36)    .88 (29)     .03 (13)     .58 (85)

* Not significant.
** Insufficient n.

Note. Figures in parentheses are n’s upon which each r is based.


The Critical Ratio for the difference in r’s 
between the analyst’s and the interviewer’s 
predictions was 0.66 for the 30-day ratings 
and 1.57 for the 90-day ratings. 
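
The article does not state how the Critical Ratio was computed. One common approach, assumed here, converts each r to Fisher's z and divides the difference by its standard error, treating the two coefficients as if they came from independent samples. The sketch below applies that formula to the pooled coefficients of Table 2; the function name `critical_ratio` is hypothetical.

```python
import math


def critical_ratio(r1, n1, r2, n2):
    """Difference between two correlations in Fisher z units over its standard error."""
    z1 = math.atanh(r1)                  # Fisher r-to-z transformation
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / standard_error


# Pooled validity coefficients and n's as reported for all raters combined.
cr_30 = critical_ratio(0.50, 142, 0.41, 49)   # analyst vs. interviewer, 30 days
cr_90 = critical_ratio(0.58, 85, 0.33, 37)    # analyst vs. interviewer, 90 days

print(round(cr_30, 2), round(cr_90, 2))
```

Under this assumption the 90-day value comes out at about 1.57 and the 30-day value within about .01 of the reported 0.66. Both fall below 1.96, the usual two-tailed 5% cutoff.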

The data seem to indicate that the blind 
use of AVA and the interview techniques both 
show substantial validity in terms of the cri- 
teria of this study. 

Correlations between predicted scores and 
criterion ratings by the individual raters, 
were, except for Rater I, on both 30- and 90- 
day predictions, and Rater II on 90-day pre- 
dictions, not statistically different from zero. 
The failure of these values to reach the ac- 
cepted limits of significance is undoubtedly 
due in a marked degree to the fact that they 
are based on relatively small sample sizes. 
Rater III proved to be inconsistent (r = .03)
between the ratings he assigned to the same
Ss at 30-day and 90-day intervals. This find-
ing suggests relatively low reliability of this
rater’s judgments and conceivably is another
factor attenuating the validity coefficient for
both predictor raters. On the other hand,
Rater I (r = .67) and Rater II (r = .88)
were considerably more consistent in their
two ratings. For all Raters the consistency
of ratings made at 30 days and at 90 days
was also relatively high (r = .58). The AVA
analyst and interviewer agreed only to a
moderate degree (r = .33) with respect to
common ratings assigned to 49 Ss.


Discussion 


The correlation coefficients between pre- 
dicted ratings and criterion scores of the com- 
bined groups on the job proficiency scale 
proved in this study to be positive and sta- 
tistically significant for both predictor raters. 
Those based upon a blind interpretation of 
AVA seem to be somewhat higher at the end 
of 90 days than those based upon personal 
interview. However, the difference proved not 
to be statistically significant. 

These findings attest to the predictive va- 
lidity of both procedures for the specific occu- 
pations studied and are suggestive of prob- 
able increased predictive efficiency of the 
combined use of AVA and personal interview. 
In the operational setting, the AVA is used 
as an adjunct to, and not a substitute for, the 
personal interview. The fact that these same 
correlations for individual raters show consid- 
erable variation from the combined raters’ 
correlations is probably due to variations in 
sample sizes and possibly the suspected un- 
reliability of the judgments of one criterion 
rater. Overall, however, the results of this 
study tend to show that equally good and 
possibly better long-range predictions of on- 
the-job success can be made through the 
skilled use of AVA as through personal inter- 
view procedures. 

The findings of this study confirm the re-
sults of other recently completed studies
(Lundin, 1957; Bennett, 1957) involving
blind analysis of AVA. These earlier studies
were concerned with the validity of AVA in 
describing personality. The present study 
has investigated and reported on the problem 
of how the temperament characteristics, as 
measured by AVA, are associated with on-the- 
job success of workers in certain occupational 
areas. 


Summary and Conclusions 


Blind predictions as to the probable on-the- 
job success of applicants for routine machine 
operational as well as other semiskilled and 
unskilled jobs in a large textile concern were 
made solely on the basis of AVA profiles. 
These predictions were in the form of nu- 
merical ratings on a job proficiency scale. 

Subjects were 142 new hires for various first 
line worker jobs over a period of 15 months. 
Several foremen, supervisors of the Ss, were 
asked to rate them on the job proficiency 
scale 30 days and 90 days after employment. 
Comparisons of the AVA analyst’s predictions 
and raters’ judgments were then made. Pre- 
dictions made by the interviewer who actu- 
ally did the hiring were also compared with 
the criterion ratings on a portion of the total 
sample. 

Internal consistencies of the rating scale 
ranged from .78 to .94 with a median of .87. 
Product moment correlations between ana- 
lyst’s predictions and the criterion scores of 
the combined groups of raters were .50 for 
30 days and .58 for 90 days. For the inter- 
viewer the r’s were .41 for 30 days and .33 
for 90 days. 

On the basis of these findings it may be 
concluded that both the interview techniques 
and the AVA, when employed by a trained 
interpreter, are valid predictors of job success 
for the occupations studied. The data of the
study also suggest that the predictive effi-
ciency may be enhanced by combining these 
two procedures in the selection of textile 
workers performing routine operational tasks. 


Received July 7, 1958.


References 


Bennett, J., Jr., Musiker, H. R., & Clarke, W. V. Activity Vector Analysis vs. clinical appraisal in personality description. Unpublished manuscript. Providence, R. I.: Walter V. Clarke Ass., 1957.

Clarke, W. V. Personality profiles of loan office managers. J. Psychol., 1956, 41, 405-412. (a)

Clarke, W. V. Personality profiles of self-made company presidents. J. Psychol., 1956, 41, 413-418. (b)

Clarke, W. V. The construction of an industrial selection personality test. J. Psychol., 1956, 41, 379-394. (c)

Clarke, W. V. The personality profiles of life insurance agents. J. Psychol., 1956, 42, 295-302. (d)

Cronbach, L. J. Coefficient alpha and the internal structure of tests. Psychometrika, 1951, 16, 297-334.

Hammer, C. H. A validation study of the Activity Vector Analysis. Unpublished dissertation, Purdue Univer., 1958.

Hoyt, C. J. Test reliability estimated by analysis of variance. Psychometrika, 1941, 6, 153-160.

Lundin, W. H. A clinical evaluation of the Activity Vector Analysis Test. Unpublished manuscript. Chicago, Illinois: T. W. Franks and Associates, 1957.

Merenda, P. F., & Clarke, W. V. AVA as a predictor of occupational hierarchy. J. appl. Psychol., 1958, 42, 289-292.

Musiker, H. R., & Clarke, W. V. The descriptive reliability of Activity Vector Analysis. Psychol. Rep., 1958, 4, 435-438.

Stanley, J. C. K-R 20 as a stepped-up mean r among items. Paper read at annual convention of the National Council on Measurements Used in Education. Atlantic City, New Jersey, Feb., 1957. (Mimeo.)

Whisler, L. W. A study of the descriptive validity of Activity Vector Analysis. J. Psychol., 1957, 43, 205-223.





Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


AN INVESTIGATION OF SOME ASPECTS OF THE 
SOCIAL PSYCHOLOGICAL IMPACT OF 
AN EDUCATIONAL TELEVISION 
PROGRAM 1


JAMES J. ASHER 2 AND RICHARD I. EVANS


University of Houston 


Since the first noncommercial educational
television station, KUHT-TV, began opera-
tions in April, 1953, a variety of studies in-
volving the educational television medium have
been completed. Typical studies of this kind,
which investigate such problems as the rela-
tive effectiveness of television vs. formal
course instruction, or the demographic and
personality characteristics of the educational
television audience, are reported elsewhere
(Adams, 1956; Carpenter, 1955; Evans,
1955; 1956; 1957; Evans, Roney, & McAdams,
1955; Husband, 1954; Kumata, 1956;
Merrill, 1956; Rock, Duva, & Murray, 1957).

Another provocative direction for research
might be the examination of the social psy-
chological impact of general educational tele-
vision programming as is typified by the pro-
ductions released to the various educational
television stations by the Educational Tele-
vision and Radio Center of Ann Arbor, Michi-
gan. Such an investigation is reported in the
present paper.

Inherent in such an investigation are sev- 
eral interesting questions. Three are dealt 
with in the present report: 


1. To what degree are attitudes of viewers changed by a given educational television program? An over-all statement that might be made in relation to the impact on attitudes of mass media messages in general is that attitude shifts are, indeed, observable. Schramm (1949) makes this relevant observation: “More than one hundred papers have now presented quantitative evidence that such change occurs, and that it can occur as a result of messages translated by any of the mass media or combination of mass and interpersonal communication.”

2. To what degree can personality measures predict the direction and intensity of attitude shifts? Here Hovland and Weiss (1951) summarize the results of investigations dealing with this problem and suggest that such predictions of change are reported, but are generally topic-bound in nature. In other words, only “specific-to-content” rather than generalized predictions of attitude change appear to have been demonstrated. A personality measure of more general applicability appears to be called for here.

3. Would the credibility of the source of an educational television program have an effect on the degree of intensity of attitude change? For example, would a greater attitude change occur if the audience believed that the program was being presented on a national commercial network rather than a local educational television station? In general, typical investigations of this problem as reported elsewhere (Janis, 1954; Wegrocki, 1934) indicate that a suggestion is more likely to be accepted if it is associated with a high prestige source.


1 In part based on data gathered in conjunction with a dissertation presented by the senior author of the present paper in fulfillment of the Doctor of Philosophy degree, Department of Psychology, University of Houston; completed under the direction of the junior author of the present paper and supported in part by a grant-in-aid awarded by the Educational Television and Radio Center, Ann Arbor, Michigan. Thanks are expressed to John Michael of the University of Houston Psychology Department for his invaluable editorial and statistical assistance, as well as to Albert Newhouse of the IBM Computer Center at the University of Houston.

2 Now a member of the psychology department at San Jose State College, San Jose, California.


To give operational expression to the ques- 
tions raised above, an attempt was made in 
the present study to evaluate the impact of 
a specific educational television program en- 
titled Puberty in Girls, one of the typical half- 
hour programs in the series entitled People 
Are Taught To Be Different, produced by the 


166 





Impact of Educational Television 


University of Houston for the Educational 
Television and Radio Center for national dis- 
tribution. This film compared the reactions 
of three cultures with respect to the psycho- 
logical onset of puberty in girls. A Negro 
professor narrates the message while Negro 
college students illustrate the context with 
modernistic or symbolic dancing and music. 

The following hypotheses were formulated
to reflect the questions raised above as they
might be dealt with in the present investiga-
tion:


1. Attitudes, as measured by the Osgood
Semantic Differential (Evaluative Dimen-
sion), toward a group of 11 typical con-
cepts represented in the program (“Negro
professor,” “KUHT-TV Channel 8,” “female
monthly cycle,” “CBS-TV Channel 11,”
“Negro,” “public discussion of sex,” “Texas
Southern University,” “public discussion of
puberty in girls,” “intelligent Negroes,” “use
of dancing as a teaching aid,” “use of un-
usual music as a teaching aid”), would change
in a significantly more favorable direction as
a result of viewing the program.

2. Highly dogmatic individuals, as indi-
cated by scores on the Rokeach Dogmatism
Scale, would resist changes in attitudes, as
measured by the Osgood Scale, to a signifi-
cantly greater extent than individuals who
were relatively less dogmatic.

3. The program when presented from a 
source of communication with allegedly high 
prestige, a major television network (the Co- 
lumbia Broadcasting System), would produce 
significantly greater attitude change than 
when it is presented from a source of alleg- 
edly lower prestige, a local noncommercial 
educational television station, KUHT-TV. 


Methodology 


Subjects, all undergraduate students, consisted of three classes
enrolled in elementary psychology at the University of Houston, not
differing significantly in essential over-all composition. The
control group (n = 30) did not view the television program described
earlier. Experimental Group I (n = 47), designated as the high
prestige group, was allowed to believe that a major television
network (Columbia Broadcasting System) was presenting the program,
while Experimental Group II (n = 36), designated as the low prestige
group, was allowed to believe that the program originated from
KUHT-TV, the University’s noncommercial educational television
station.

The measurement of attitudes in the present study, 
as suggested above, was effected through the use of 
the Semantic Differential, Evaluative Dimension, of 
Osgood (1952), since as a seven-point generalized 
attitude scale, now widely used, it appeared to lend 
itself well to the evaluation of the impact of a mass 
medium message involving an array of concepts such 
as is involved in the one-half hour television pro- 
gram used in the present study. 

The search for a personality measure that would be less
“topic-bound” as it potentially relates to attitude change led the
authors of the present paper, as suggested above, to the Dogmatism
(D) Scale of Rokeach (1954). Rokeach and Fruchter (1956)
demonstrated by factor analysis that D measures the 
rigidity, authoritarianism, and anxiety of the F and 
E scales; yet dogmatism has the advantage of being 
independent of belief content. Therefore, no matter 
what the content of a communication might be, we 
would theoretically expect that personality factors 
reflected in D-scale scores might be involved in a 
tendency toward opinion or attitude change. 

One of the experimenters administered the Semantic Differential and
Dogmatism Scales to the control and two experimental groups of Ss in
a pretest, presented the television program to the two experimental
groups, and administered the posttest of the Osgood and Rokeach
scales to the three groups. Exposure to the program was effected in
a specially designed closed-circuit viewing room for both
experimental groups. Evidence gained from responses of Ss confirmed
the success of the staging of an ostensibly genuine telecast in each
instance. In statistically analyzing the results in the present
study, the nonparametric techniques, chi square and rank order
correlation, appeared to be applicable.


Results and Discussion 


With respect to the first hypothesis, shifts in only three of 11
instances proved to be statistically significant. These were toward
the following concepts: (a) “female monthly cycle,” which was
significant in the high prestige and low prestige groups at the .10
and .02 levels (chi squares of 2.8 and 5.5, respectively, with 1
degree of freedom); (b) “Negro,” which was significant at the .01
and .05 levels (chi squares of 7.1 and 3.5, respectively, with 1
degree of freedom); (c) “public discussion of puberty in girls,”
which was significant at the .05 and .01 levels (chi squares of 3.8
and 16.0, respectively, with 1 degree of freedom).

No significant changes with respect to any 
of the eleven concepts appeared in the con- 
trol group. 
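As a rough check on the significance levels quoted for the three concepts, the upper-tail probability of a chi square with 1 degree of freedom can be recomputed directly. The sketch below is illustrative only: the function name is hypothetical, the pairing of each statistic with the high and low prestige groups follows the order given in the text, and because the printed statistics are rounded to one decimal, values close to a critical point (about 3.84 for the .05 level) may fall on either side of it.

```python
import math

def chi2_p_1df(x):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # With 1 df, chi-square is the square of a standard normal deviate,
    # so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))

# Chi squares as printed in the text: (high prestige group, low prestige group).
reported = {
    "female monthly cycle": (2.8, 5.5),
    "Negro": (7.1, 3.5),
    "public discussion of puberty in girls": (3.8, 16.0),
}

for concept, (high, low) in reported.items():
    print(f"{concept}: high prestige p = {chi2_p_1df(high):.4f}, "
          f"low prestige p = {chi2_p_1df(low):.4f}")
```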







The fact that a shift in the attitude toward 
the concept, “Negro,” was found suggests 
that in such instances where the original atti- 
tude was probably highly crystallized through 
previous experience, as is the case in the 
South with respect to attitudes toward “Ne- 
gro,” this type of “random target” television 
program may have at least an immediate 
effect. However, impacts involving newer as- 
sociations such as the concepts, “Negro pro- 
fessor” or “dancing as a teaching aid,” may 
be difficult to effect through any single edu- 
cational television program. 

With respect to the second hypothesis, rank 
order correlations were calculated between 
Dogmatism Scale scores and difference scores 
(pre-attitude test minus post-attitude test) 
regardless of sign in both experimental groups. 

Only two correlations in 22 were significant 
within the .05 level of confidence. Only one 
of the two was in the predicted direction. 
Such findings, of course, could easily have oc- 
curred by chance. Thus the basic hypothesis 
must be rejected. 
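The remark that two significant correlations out of 22 could easily have occurred by chance can be illustrated with a binomial calculation. The sketch assumes, purely for illustration, that the 22 tests are independent and that each has a .05 chance of a spurious significant result; neither assumption is stated in the paper, and correlated attitude measures would change the figure.

```python
from math import comb

def prob_at_least(k, n, alpha):
    """P(at least k 'significant' results among n independent tests at level alpha)."""
    return sum(comb(n, i) * alpha**i * (1 - alpha)**(n - i) for i in range(k, n + 1))

n_tests = 22   # correlations computed (11 concepts x 2 experimental groups)
alpha = 0.05   # significance level used in the text

expected_false_positives = n_tests * alpha
p_two_or_more = prob_at_least(2, n_tests, alpha)

print(f"expected chance findings: {expected_false_positives:.2f}")
print(f"P(two or more by chance): {p_two_or_more:.3f}")
```

Under these assumptions roughly one chance finding is expected, so observing two gives little reason to retain the hypothesis.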

However, another possibility would be that 
the greater the dogmatism, the more attitudes 
towards the source of the message will tend 
to determine attitude toward the content of 
the message. A correlational analysis of this 
possibility revealed no statistically acceptable 
support of it. 

Still another possibility might be that highly dogmatic Ss will tend
to be more extreme in the intensity of their own attitudes than are
low dogmatics. None of the chi squares computed to examine this
possibility supported it.

However, the D scale might have predicted attitude change if the
communication as a whole was concerned with a more overtly
emotionally involving content than “puberty in girls.” Perhaps the
content of the program used in the present study most directly
possessed a peripheral, non ego-involving character to the
respondent, even though highly controversial content was present in
a more subtle sense. Therefore, in future studies utilizing the D
scale as a possible predictive tool, messages might be selected that
are more overtly ego-involving. This might supply the basis for a
more sensitive evaluation of the D scale as a predictive measure.

For the third hypothesis, the chi-square 
test was used to test the significance of the 
differences between the high and low prestige 
groups on positive attitude shifts (pretest 
subtracted from the posttest) as measured by 
the Semantic Differential. None of the tests 
approached significance at even the .10 level 
of confidence. 

In analyzing this portion of our results, a 
consideration of the adaptation level (AL) 
theory of Helson (1951) might prove pro- 
vocative. Did the high prestige group, for 
example, expect a typical commercial net- 
work program, but instead were given a pro- 
duction that deviated sharply from that an- 
ticipation? Perhaps the distance between ex- 
pectancy and actuality, which was probably 
greater for the high prestige group, precipi- 
tated a negative “set” toward the program, 
which in turn influenced attitudes toward it. 
Some indirect evidence reported by Asher 
(1957) suggests this possibility. It was 
shown that the low prestige group had slightly 
higher correlations and a greater number of 
significant correlations between their attitude 
toward the program as a whole and attitudes 
towards concepts within the program. Also, 
this group liked the program as a whole more 
than did the audience who expected a CBS 
program. This suggests that the prestige of 
the communication source has by no means 
a simple, direct relationship to the persuasive- 
ness of the communication. 

Another possibility that may be suggested 
here is that the very fact that the impact of 
the same program presented ostensibly over 
an educational television station was at least 
as great as it was when presented ostensibly 
over a major commercial television network 
may be regarded with great encouragement 
by participants in the educational television 
movement. It is possible that the self-conscious belief of many
participants in educational television activities, that the lower
relative prestige of educational television stations as a source of
programming diminishes the impact of that programming, is erroneous.
In fact, as suggested in an earlier study by Evans (1957), the
conception that educational television lacks prestige in the eyes of
the viewers may be more imagined than real.


Received July 22, 1958. 


References 


Adams, J. S. An exploratory study of viewers and 
non-viewers of educational television. Chapel Hill: 
Inst. Res. soc. Sci., Univer. of North Carolina, 
1956. 

Asher, J. J. An investigation of a group of social 
psychological factors related to the impact of an 
educational television program. Unpublished doc- 
toral dissertation, Univer. of Houston, 1957. 

Carpenter, C. R. Psychological research using tele- 
vision. Amer. Psychologist, 1955, 10, 606-610. 

Evans, R. I. The planning and implementation of a 
psychology series on a non-commercial educational 
television station. Amer. Psychologist, 1955, 10, 
602-605. 

Evans, R. I. An examination of students’ attitudes 
toward television as a medium of instruction in a 
psychology course. J. appl. Psychol., 1956, 40, 
32-34. 

Evans, R. I. An analysis of some demographic and 
psychological characteristics of an educational tele- 
vision audience. Ann Arbor: Educational Tele- 
vision and Radio Center, 1957. 

Evans, R. I., Roney, H. B., & McAdams, W. J. An 
evaluation of the effectiveness of instruction and 
audience reaction to programming on an educa- 
tional television station. J. appl. Psychol., 1955, 
39, 277-279. 


Helson, H. Perception. In H. Helson (Ed.), Theoretical foundations
of psychology. New York: Van Nostrand, 1951. Pp. 348-385.

Hovland, C., Lumsdaine, A. A., & Sheffield, F. D. Experiments in
mass communication. Princeton: Princeton Univer. Press, 1949.

Hovland, C. I., & Weiss, W. The influence of source credibility on
communication effectiveness. Publ. opin. Quart., 1951, 15, 635-650.

Husband, R. W. Television versus classroom for learning general
psychology. Amer. Psychologist, 1954, 9, 181-183.

Janis, I. L. Personality correlates of susceptibility to persuasion.
J. Pers., 1954, 22, 504-518.

Kumata, H. An inventory of instructional television research. Ann
Arbor: Educational Television and Radio Center, 1956.

Merrill, I. R. Benchmark television-radio study, Part I: Lansing.
WKAR-TV Res. Rep. East Lansing: Michigan State Univer., 1956. (Rep.
No. 561M.)

Osgood, C. E. The measurement of meaning. Psychol. Bull., 1952, 49,
197-237.

Rock, R. T., Jr., Duva, J. S., & Murray, J. E. Training by
television: The comparative effectiveness of instruction by
television, television recordings and conventional classroom
procedures. Port Washington, L. I., N. Y.: Special Devices Cent.,
1957. (SDC Rep. 476-02-2 [NAVEXOS P-850-2].)

Rokeach, M. The nature and meaning of dogmatism. Psychol. Rev.,
1954, 61, 194-204.

Rokeach, M., & Fruchter, B. A factorial study of dogmatism and
related concepts. J. abnorm. soc. Psychol., 1956, 53, 356-360.

Schramm, W. The effects of mass communication: A review. Journalism
Quart., 1949, 26, 307-409.

Wegrocki, H. J. The effect of prestige suggestibility on emotional
attitude. J. soc. Psychol., 1934, 5, 384-394.





Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


THE PSYCHOLOGIST AS AN INSTRUMENT OF 
PREDICTION 


ARNE TRANKELL 


Institute of Education, Stockholm University, Sweden 


In the Journal of Abnormal and Social Psy- 
chology Robert R. Holt (1958) has drawn 
attention to a technically important question 
which ought to be taken into consideration in 
any particular predictive enterprise. He sum- 
marized his discussion as follows: “When 
clinical methods are given a chance—when 
skilled clinicians use methods with which they 
are familiar, predicting a performance about 
which they know something—and especially 
when the clinician has a rich body of data 
and has made the fullest use of the system- 
atic procedures developed by actuarial work- 
ers, including a prior study of the bearing of 
the predictive data on the criterion perform- 
ance, then sophisticated clinical prediction 
can achieve quite respectable successes.” 

What Holt really pleads for is a combina- 
tion of statistical prediction based on stand- 
ardized tests and a clinical approach in which 
the psychologist is given opportunity to evalu- 
ate all data and in which the psychologist can 
base his own predictions on a correct insight 
into the job for which the prediction is made. 
The present author is convinced that Holt’s 
article is extremely important. He has him- 
self used the combined approach with con- 
siderable success since 1951 in the selection 
of pilots for the Scandinavian Airlines Sys- 
tem. The results of an examination of this 
work during the first four years were pub- 
lished in 1956 in a Swedish journal for avia- 
tion medicine, where the efficiency of such an 
approach was demonstrated. The results of
the selection during six years have now been
examined and are presented in this article. I
believe that they form an illustration of what
Robert R. Holt has wanted to prove.


The Selection Task 
Since 1951 the applicants for co-pilot 
courses in Scandinavian Airlines System have 
been examined by means of a system of se- 
lection worked out by the author. This sys- 




tem is based on a job analysis of the airline 
pilots. 

On the basis of this analysis the following 
list of assessment variables was made up: 


1. Maturity 

2. Self-reliance 

3. Authority 

4. Tactfulness 

5. Independence 

6. Social Adjustment 

7. Sensitivity to Criticism 
8. Panic Resistance 

9. Motor Skill 

10. Verbal Intelligence 

11. Inductive Intelligence 
12. Technical Intelligence 
13. Ability to Organize 
14. Simultaneous Capacity 


For the assessment of some of the variables 
a battery of standardized tests was used. 
Other variables were assessed in special pro- 
cedures applied in individual examinations. 
Among these were Motor Skill, Panic Resist- 
ance and Simultaneous Capacity assessed by 
means of a technique devised by the author 
and described by Langewiesche (1955). 

The individual examinations were performed 
by two or three psychologists who examined 
each applicant independently. This arrange- 
ment had a correcting as well as supplement- 
ing effect. The examinations led up to reports 
in which the psychologists described the ap- 
plicant’s characteristics as regards the assess- 
ment variables. In addition to this they rated 
the variables on a stanine scale. Aptitude as 
co-pilot and as captain were also assessed on 
the stanine scale. 
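For readers unfamiliar with the stanine scale, the sketch below shows the conventional way a standard score is placed into one of nine bands, each half a standard deviation wide and centered on stanine 5. It is background only: in the procedure described here the psychologists assigned stanines by judgment, and the function name and cut-point list are illustrative assumptions, not part of the author's method.

```python
from bisect import bisect_right

# Conventional stanine boundaries in standard-deviation (z) units.
# Each of the middle seven bands is 0.5 SD wide; stanine 5 spans -0.25 to +0.25.
CUTS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]

def stanine(z):
    """Return the stanine (1-9) for a standard score z."""
    # bisect_right counts how many cut points lie at or below z.
    return bisect_right(CUTS, z) + 1

for z in (-2.0, -0.5, 0.0, 0.3, 2.0):
    print(f"z = {z:+.2f} -> stanine {stanine(z)}")
```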

The final decision as regards each applicant was obtained at a
meeting between those psychologists who had examined the applicant.
At these meetings the reports were read against each other. On
points where opinions differed the applicant was discussed until
agreement was reached. Finally the ratings of the assessment
variables were read, and an agreed estimate was arrived at for each
of them. The company was not advised to hire applicants with
aptitude scores lower than five on the stanine scale.

During the years 1951-56 a total of 780 applicants¹ were examined.
Altogether 363 applicants were assigned to co-pilot courses. During
or subsequent to their training period, 29 of the assignees were
dismissed due to inability to fly in SAS. The dismissals were based
on the results obtained in the co-pilot courses and on the opinions
of the instructors and checkpilots of SAS. The aptitude assessments
were in no case known to instructors and checkpilots, who had to
base their opinions on their own observations of each pilot aspirant
during and after the course.

The validity of the selection system is examined by comparisons
between the remaining and the dismissed pilots.

¹ One of the requirements to be fulfilled in order to be accepted
for the psychological examinations was to have at least 350 hours’
flying experience. This implies that all testees had already proved
themselves able to fly an airplane.


The Test Variables


Standardized paper and pen tests were used in measuring the
following traits:


Inductive Intelligence
Verbal Intelligence
Mechanical Comprehension
Sensitivity (personal inventory)
Simultaneous Capacity (cancellation test)


Table 1

Correlations and t Values for the Tests

Test Variable                 r_bis           Difference       t (a)

Simultaneous Capacity         0.42 ± .09      1.67 ± .36       4.64
Inductive Intelligence        0.33 ± .09      1.06 ± .33       3.21
Verbal Intelligence           0.28 ± .13      0.94 ± .42       2.24
Mechanical Comprehension      0.21 ± .09      0.73 ± .32       2.28
Sensitivity                  -0.07 ± .10     -0.25 ± .34       0.74

(a) $t = \dfrac{\bar{X} - \bar{Y}}{s\,\sqrt{\dfrac{1}{N_x} + \dfrac{1}{N_y}}}$, where $s^2 = \dfrac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{N_x + N_y - 2}$

The bi-serial correlations between test scores 
and the criterion remaining-dismissed are 
shown in Table 1. The differences between 
the two groups and the ¢ values are also 
given. The results do not differ from those 
usually obtained for tests of this kind. 
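The t values for the tests can be cross-checked against the mean differences and their standard errors, since each t is simply the difference divided by its standard error. The sketch below uses the differences printed for the five tests and the t values listed for the tests in Table 3; the dictionary layout and variable names are illustrative assumptions, and the sign of the Sensitivity difference is ignored because the t value is printed without one.

```python
# (mean difference, standard error, reported t) for each standardized test.
table1 = {
    "Simultaneous Capacity":    (1.67, 0.36, 4.64),
    "Inductive Intelligence":   (1.06, 0.33, 3.21),
    "Verbal Intelligence":      (0.94, 0.42, 2.24),
    "Mechanical Comprehension": (0.73, 0.32, 2.28),
    "Sensitivity":              (-0.25, 0.34, 0.74),
}

# Recompute t as |difference| / standard error and compare with the printed value.
recomputed = {name: abs(diff) / se for name, (diff, se, _) in table1.items()}

for name, (diff, se, t_reported) in table1.items():
    print(f"{name:26s} t = {recomputed[name]:.2f} (reported {t_reported:.2f})")
```

Agreement to within rounding suggests that the printed differences, standard errors, and t values belong to the same rows.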

Besides the standardized tests the battery 
contained a number of tasks designed to form 
the basis for individual examinations. Among 
those were an autobiographical essay and a 
questionnaire regarding childhood, adoles- 
cence, school, family life, spare time activi- 
ties, etc. Another questionnaire, Flying An- 
amnesis, contained questions about the appli- 
cant’s career as a pilot in the Air Force (or 
elsewhere), flying hours, types of aircraft 
flown, ranking in various schools, details con- 
cerning first interest in flying, emergencies, 
and incidents. 

Finally the test battery contained a com- 
plicated organizational problem, where the in- 
terplay of a number of factors had to be kept 
in mind when working out a schedule. The 
test demands a fairly high level of general in- 
telligence, as well as perseverance and self- 
reliance. It allows a great number of solu- 
tions. No standardized norms were applied 
in evaluating this test, which was regularly 
used as a subject for discussions in the indi- 
vidual examinations. 


The Assessment Variables 


The standardized tests could not provide 
material for an adequate assessment of all 
variables to be considered. All assessment 
variables were, however, considered in the in- 
dividual examinations. 







In assessing the variables representing in- 
tellectual capacities the results of the stand- 
ardized tests were looked upon as hypotheses 
to be further examined. Extremely low test 
results were regularly looked upon with sus- 
picion and checked by individual discussions. 
The corrections were determined by the ex- 
aminer’s personal judgment and _interpreta- 
tion. The applicant’s ability to use his ca- 
pacities in practical life situations was thus 
under consideration before the ratings were 
looked upon as definite. 

For the assessment of Motor Skill, Panic Resistance and Simultaneous
Capacity a procedure called Tapping was used. On a paper were
printed two patterns, consisting of small circles connected by
straight lines. One pattern was for the left, one for the right hand.
The applicant was given a pencil in each hand 
and was told to place the point of the pencil 
in one of the small circles in either pattern. 
The pencils then had to be moved from circle 
to circle following the lines. The applicant 
moved alternately the right and the left hand 
and the speed was determined by the examiner 
beating the time. The degree of difficulty 
could easily be adapted to the coordination 
level of the applicant. By increasing the 
speed the difficulty could be greatly increased 
and cause heavy stress on the applicant. 
Sudden unexpected difficulties were also in- 
troduced in order to create panic reactions. 




Together with findings such as lowered self- 
confidence, these signs were used in assessing 
the Panic Resistance variable. Also to be
taken into consideration in this connection
were the applicant’s descriptions of his own
reactions in emergencies or critical situations
which he had experienced. Motor Skill and
Simultaneous Capacity were assessed by ob- 
servations of the tapping behavior in its pure 
form and when complicated by intellectual 
problems to be solved simultaneously with 
the manual task. No quantitative norms 
were used and no quantitative scoring was 
done except for the stanine ratings. 

In Table 2 the evaluations of the assessment 
variables are shown. All the correlations are 
significant except for the typical captain traits 
Maturity and Tact. The variables are ar- 
ranged in accordance with the size of the bi- 
serial correlations. Those variables appear 
highest on the list which have been assumed 
as most important for a co-pilot. 
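The bi-serial coefficient used throughout relates a continuous rating to a two-way criterion (here remaining versus dismissed) that is assumed to reflect an underlying normal variable. A minimal sketch of the textbook formula follows; the data are invented for illustration, the function name is hypothetical, and nothing here reproduces the author's actual computations.

```python
from statistics import NormalDist, mean, pstdev

def biserial(scores, remained):
    """Bi-serial correlation between scores and a dichotomous criterion.

    r_bis = (M1 - M0) / s * (p * q / h), where M1 and M0 are the group means,
    s is the standard deviation of all scores, p and q are the group
    proportions, and h is the normal ordinate at the point dividing p from q.
    """
    kept = [s for s, ok in zip(scores, remained) if ok]
    lost = [s for s, ok in zip(scores, remained) if not ok]
    p = len(kept) / len(scores)
    q = 1 - p
    h = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (mean(kept) - mean(lost)) / pstdev(scores) * (p * q / h)

# Hypothetical stanine ratings and outcomes (True = remained, False = dismissed).
scores = [7, 6, 8, 5, 6, 3, 4, 7, 5, 2]
remained = [True, True, True, True, True, False, False, True, True, False]

r_bis = biserial(scores, remained)
print(f"bi-serial r = {r_bis:.2f}")
```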


Table 2

Correlations and t Values for the Assessments

Assessment Variable           r_bis          Difference       t

Simultaneous Capacity         0.55 ± …       0.98 ± .16       6.13
Panic Resistance              0.47 ± …       1.08 ± .21       …
Self Reliance                 0.44 ± …       1.10 ± .23       …
Motor Skill                   0.43 ± …       0.84 ± .18       …
Social Adjustment             0.40 ± …       0.79 ± .18       …
Inductive Intelligence        0.40 ± …       1.19 ± .27       4.35
Ability to Organize           0.36 ± …       0.93 ± .24       …
Verbal Intelligence           0.32 ± …       0.82 ± .24       3.45
Mechanical Comprehension      0.30 ± …       0.88 ± .27       3.26
Independence                  0.30 ± …       0.64 ± .20       …
Authority                     0.28 ± …       0.50 ± .17       …
Maturity                      0.11 ± …       0.23 ± .19       …
Tact                          0.003 ± …      0.05 ± .17       …
Sensitivity                  -0.21 ± …      -0.49 ± .18       2.79


Improvement of the Validity of the Tests


Of special interest are those variables which were assessed by using
the results of the standardized tests as starting points. Table 3
sets out side by side the bi-serial correlations for these tests and
the corresponding coefficients for the assessments. In order to show
to what extent the tests have been utilized by the psychologists,
that part of the variation of the assessment variable which can be
explained by the variation of the test variable has been indicated
in the third column. As could be expected, the tests prove to be
least utilized in the cases where the assessment variable is only
partly covered by the test variable, i.e., Simultaneous Capacity and
Sensitivity.

Table 3

Comparison Between Tests and Assessments

                              Correlations with
                                  Criterion           t Values         Common
Variables Tested and Assessed  Assessm.   Test     Assessm.   Test    Variation

Simultaneous Capacity            0.55     0.42       6.13     4.64      25%
Inductive Intelligence           0.40     0.33       4.35     3.21      78%
Verbal Intelligence              0.32     0.28       3.45     2.24      85%
Mechanical Comprehension         0.30     0.21       3.26     2.28      88%
Sensitivity                     -0.21    -0.07       2.79     0.74      11%

According to these results, the method used for the assessment of
certain traits of the pilot applicants has given more valid results
than were achieved by means of the standardized tests as such. It
has, however, to be observed that the criterion (the ability to pass
certain courses for pilot aspirants in SAS) is not a measure of the
trait itself. The assessment can take into account that sort of
simultaneous capacity, inductive, verbal, or mechanical intelligence
which is of special importance for the pilot, something the test
cannot do. It thus seems possible to improve the predictive capacity
of the measurement by modifying the results of standardized tests in
accordance with the findings and interpretations of examiners who
are well acquainted with the kind of job for which the selection is
made.


The Efficiency of the System

The efficiency of the selection system is measured by the
correlation between the co-pilot stanines and the criterion
remaining-dismissed.² The bi-serial correlation amounts to
0.75 ± .07. The difference between the means of the two groups is
1.76 ± .20, which yields t = 8.80.

² The correlation between the captain stanines and the criterion
amounts to 0.51 ± .08. This coefficient, however, is of less
interest, since the criterion refers to the efficiency as co-pilot.
The partial correlation between the captain variable and the
criterion with the pilot variable held constant has also been
calculated. It amounts to -.09. The correlation between the captain
variable and the co-pilot criterion is thus wholly due to the pilot
qualities required of a good captain.

In the recommendations to SAS the aptitudes as co-pilot and captain
were summarized into four categories of suitability for employment.
The categories were as follows:


1. Particularly suitable (neither stanine less than 5, total at
   least 14)
2. Suitable (neither stanine less than 5, total 10-13)
3. Doubtful (one stanine 4 against the other 5 or more, one stanine
   3 against the other 7 or more)
4. Unsuitable (remaining combinations)


SAS was in no case obliged to follow the recommendations. In 1951
the psychologists were in fact disregarded fairly often, above all
in cases where the company had received information from former
employers which did not coincide with the opinion of the
psychologists. The experience obtained from such cases resulted in
the psychological assessment later being more relied upon. In Table
4 the number of dismissals within the various categories of
suitability is shown.

In Table 5 the number of dismissals within the different
pilot-stanine categories is shown. The extreme classes have been
pooled together into 9-8 and 3-1, because of the small number of
individuals in each class.
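The four suitability categories defined from the co-pilot and captain stanines amount to a small decision rule, which can be sketched as follows. The function name and return strings are illustrative; the thresholds are taken directly from the category definitions in the text.

```python
def suitability(co_pilot, captain):
    """Classify an applicant from the co-pilot and captain stanines (1-9)."""
    low, high = sorted((co_pilot, captain))
    if low >= 5:
        # Neither stanine below 5: the total decides between the top two categories.
        return "Particularly suitable" if low + high >= 14 else "Suitable"
    # One stanine 4 against the other 5 or more, or one stanine 3 against 7 or more.
    if (low == 4 and high >= 5) or (low == 3 and high >= 7):
        return "Doubtful"
    return "Unsuitable"

for pair in [(7, 7), (5, 6), (4, 6), (3, 7), (3, 6), (4, 4)]:
    print(pair, "->", suitability(*pair))
```

Because both stanines are at least 5 in the first branch, the total is never below 10, so the "total 10-13" condition needs no separate test.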





Table 4

Dismissal Rate in Various Categories of Suitability for Employment

                                                   Dismissals in
Category                 Employed    Dismissed     Percentage

Particularly Suitable       49           …             …
Suitable                     …           …            3.7%
Doubtful                    59           …            6.8%
Unsuitable                  37           …           45.9%

Total                        …           …            8.0%
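Several cells of Table 4 are not legible in this copy, but the surviving figures constrain them. The sketch below shows one way the gaps might be inferred; it is a reconstruction under stated assumptions, not data from the article. It assumes that the three legible counts (49, 59, 37) are numbers employed, that the three legible percentages belong to the Suitable, Doubtful, and Unsuitable rows, and that the totals are the 363 assignees and 29 dismissals reported in the text.

```python
TOTAL_EMPLOYED = 363    # applicants assigned to co-pilot courses (from the text)
TOTAL_DISMISSED = 29    # assignees dismissed (from the text)

# Legible counts, assumed to be the Employed column; None marks an illegible cell.
employed = {"Particularly suitable": 49, "Suitable": None,
            "Doubtful": 59, "Unsuitable": 37}
# Legible dismissal percentages, assumed to belong to these three rows.
rate_pct = {"Suitable": 3.7, "Doubtful": 6.8, "Unsuitable": 45.9}

# The missing employed count is whatever is left of the total.
known = sum(n for n in employed.values() if n is not None)
employed["Suitable"] = TOTAL_EMPLOYED - known

# Dismissal counts implied by the percentages, rounded to whole pilots.
dismissed = {cat: round(employed[cat] * pct / 100) for cat, pct in rate_pct.items()}
dismissed["Particularly suitable"] = TOTAL_DISMISSED - sum(dismissed.values())

for cat in employed:
    print(f"{cat:22s} employed {employed[cat]:3d}  dismissed {dismissed[cat]:2d}")
```

If the assumptions hold, the inferred counts reproduce the printed percentages when rounded to one decimal, which is a useful consistency check but not a substitute for the original table.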


Discussion 

Ordinary selection systems are based on the 
predictive capacity of a test battery. Each 
test is correlated with criteria obtained in ear- 
lier investigations, and the correlations serve 
as a basis for a multiple regression analysis. 
Thus a system of weights is determined for 
the calculation of the final stanine scores. 
The procedure is altogether statistical. The 
advantages of such a system consist above all 
in its capacity. It makes it possible to ex- 
amine and assess the aptitude of a large num- 
ber of applicants within a relatively short 
time. 

In contradistinction to this system the se- 
lection work for SAS has the characteristics 
of a craftsman’s job. It is certainly more ex- 
pensive than the pure test system, since quali- 
fied psychologists have to examine each ap- 
plicant. It is true that standardized tests are 
used in the SAS system as well, but these 
tests serve as tools for an individual assess- 
ment, the result of which depends on the ex- 
aminer’s capacity to make use of the tools at 
his disposal. Instead of a system of fixed co- 
efficients its technique of dynamic interpreta- 
tion works with what could be designated as 
flexible weights. This means, for instance, 
that the dynamic interpretation is able to 
avoid overestimation, when a man with high 
scores in the tests has proved himself unable 
to use his capacities in practical life situations. 

There is one drawback to a system based 
on dynamic interpretations: it depends on the 


skill of the psychologists who do the work. 
The selection of examiners is therefore a prob- 
lem of the same importance as the selection of 
tests in a mechanical prediction system. Not 
only the selection, but also the training of the 
examiners is a laborious task. Parallel ex- 
aminations of the same examinees must be a 
routine procedure to control the standards 
used, and individuals employed must be fol- 
lowed up continuously in order to maintain 
and improve the validity of the predictions. 

Experiments often indicate that the interviewer
(or the interview!) cannot increase the
validity of predictions based on tests. The 
significance of this is doubtful. It seems to 
be true, however, that the interview is less 
effective when the information used by the 
interviewer is confined to what is supplied by 
the interview itself. But this seems to be an 
almost foolish way to use an examiner, since 
it deprives him of the natural tools of the 
psychologist, e.g., the results of standardized 
tests and the techniques of individual ex- 
aminations. 

The selection system worked out by the author is meant to be a
synthesis of a statistical and a clinical approach. The psychologist
has been given the leading role in this synthesis, since he knows
the limits and efficiency of the tools. There is no magic in the
fact that psychologists, when given this leading role, can be more
effective as predictors than batteries of tests. It is a question of
experience and training, proficiency and sense for relevant facts,
and last but not least ability and courage to make an intelligent
use of the tools of psychology.


Table 5

Dismissal Rate in Different Pilot Stanine Classes

                                            Dismissals in
Pilot Stanine    Employed    Dismissed      Percentage

9-8                 11           …              …
7                   82           …             1.2%
6                  126           …             2.4%
5                  104           …             8.7%
4                   25           …            20.0%
3-1                 15           …            73.3%

Total                …           …             8.0%
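The figures in Table 5 also allow a rough look at what different hiring cut-offs would have implied. The sketch below is illustrative only: the dismissal counts are not legible in this copy and are inferred here from the printed percentages and the total of 29 dismissals given in the text, and the assignment of the unlabeled rows to stanines 7, 6, 5, and 4 follows the statement that only the extreme classes were pooled.

```python
classes = ["9-8", "7", "6", "5", "4", "3-1"]
employed = [11, 82, 126, 104, 25, 15]
rate_pct = [None, 1.2, 2.4, 8.7, 20.0, 73.3]   # None: percentage not legible
TOTAL_DISMISSED = 29                            # from the text

# Infer dismissal counts from the percentages; the illegible row gets the remainder.
dismissed = [None if r is None else round(n * r / 100)
             for n, r in zip(employed, rate_pct)]
remainder = TOTAL_DISMISSED - sum(d for d in dismissed if d is not None)
dismissed = [remainder if d is None else d for d in dismissed]

def rate_if_cutoff(k):
    """Dismissal rate (%) among pilots in the top k stanine classes."""
    return 100 * sum(dismissed[:k]) / sum(employed[:k])

for k in range(1, len(classes) + 1):
    print(f"hiring down to class {classes[k - 1]:>3s}: "
          f"{rate_if_cutoff(k):.1f}% dismissed")
```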


Received July 29, 1958. 


References 


Holt, R. R. Clinical and statistical prediction: A 
reformulation and some new data. J. abnorm. 
soc. Psychol., 1958, 56, 1-12. 


Langewiesche, W. Are airline pilots any good? Air 
Facts, 1955, 18, 29-58. 

Trankell, A. Rekryteringen av piloter i Svenska
flygvapnet. Tidskrift i Militär Hälsovård, 1956,
1, 1-30.

Trankell, A. Erfarenheter av en metod för uttagning
av piloter till Scandinavian Airlines System. Med-
delanden från Flyg- och Navalmedicinska Nämnden,
1956, No. 1.





Journal of Applied Psychology
Vol. 43, No. 3, 1959


A CODING SYSTEM FOR TOTAL PROFILE ANALYSIS 
OF THE STRONG VOCATIONAL INTEREST 
BLANK 


JOHN O. CRITES 


State University of Iowa 


Although Darley’s (1941) technique of pat- 
tern analysis of the Strong Vocational Interest 
Blank (SVIB) has been widely used in both 
clinical practice and research, the need for a 
coding system appropriate for total profile 
analysis of the SVIB has become increasingly 
apparent. As Kirk (1956, p. 309) has pointed 
out with reference to Darley’s method: “There
is a need both in research and in counseling 
for total profile analysis not served by ad- 
herence to this system.” At present there is 
no method available which expresses both 
characteristics of the interest profile: the 
elevation and shape of the pattern (Cronbach 
& Gleser, 1953; DuMas, 1947). Interpreta- 
tions of interest profiles based upon standard 
scores, letter ratings, or Darley’s approach 
take into consideration the elevation of the 
interest pattern but not its shape. They ex- 
press the degree to which an individual's in- 
terests are similar to those of men engaged 
in occupations within an interest group but 
do not represent the configuration formed by 
the varying elevations across interest groups. 
That the shape of an interest pattern as well 
as its elevation may be psychologically sig- 
nificant, however, and should be considered in 
the interpretation of SVIB profiles, has been 
suggested in a study by the writer (1957). 
It would seem desirable, therefore, to de- 
vise a coding system which would account 
for the elevation and shape of the interest 
profile, not only as an aid in individual voca- 
tional diagnosis and counseling but as a tool 
in research. 


Desiderata of a Coding System 
In addition to as isomorphic a representa- 
tion of the elevation and shape of an interest 
profile as possible, the coding system should 
ideally have other characteristics. Super- 
fluous meanings associated with the code 
symbols should be kept to a minimum; the 




definition of the profile must be as opera- 
tional as possible. Codes should be communi- 
cable verbally to facilitate case conference 
and other discussions of the characteristics 
of individuals with given types of profiles. 
The coding system should be exhaustive in 
nature: all possible combinations of elevation 
and shape should be codeable. Moreover, it 
should be comprehensive enough to allow 
ready manipulation, such as in the classifica- 
tion and filing of profiles. And, the system 
should be adaptable to use in research, i.e., 
subject to at least a nominal level of quanti- 
fication for purposes of statistical analysis. 

The proposed method of coding SVIB pro- 
files which follows represents an attempt to 
meet these criteria. It has been modeled after 
the coding systems devised by Hathaway 
(1947) and Welsh (1948) for the MMPI 
and therefore bears considerable resemblance 
to them. In one rather basic respect, how- 
ever, it differs. Whereas the assumption un- 
derlying the coding and interpretation of the 
MMPI is that “. . . the shape of the total 
profile is of greater significance than the 
elevation of single scores” (Hathaway & 
Meehl, 1956, p. 137), the assumption made 
in the analysis of interest patterns on the 
SVIB is that elevation is more important than 
shape. The coding system outlined below re- 
flects this distinction. 


The Coding System 

The basic structure of the coding system 
accounts for both the elevation and the shape 
of the interest pattern. The elevation of the 
SVIB profile is designated by classifying in- 
terest groups by their appropriate code num- 
bers according to type of interest pattern: 
primary, secondary, or reject. The intensities 
of the nonoccupational scales—occupational 
level (OL), masculinity-femininity (MF), and 
interest maturity (IM)—are coded in much 







the same manner as the interest groups but 
with slightly different score ranges than those 
equivalent to the letter ratings since the 
standard scores have different meanings on 
these scales (Strong, 1943). The shape or 
configuration of the profile is indicated by 
the order of primary, secondary, and reject 
patterns from left to right positions in the 
code. Those interest groups in which pri- 
mary patterns occur are assigned to the first 
code position, secondary patterns to the sec- 
ond code position, and reject patterns to the 
third code position. Thus, the elevation and 
shape of the interest profile are represented 
in the code by type of pattern (primary, sec- 
ondary, reject) and position (from left to 
right), respectively. 

To determine primary, secondary, and re- 
ject patterns Darley and Hagenah’s (1956) 
revision of the former’s method of interest 
profile analysis is used. Instead of identify- 
ing primary, secondary, and tertiary patterns 
for all interest groups, including single-occu- 
pation groups, the newer procedure defines 
primary, secondary, and reject patterns in 
seven interest groups which include all single- 
occupation groups with the exception of the 
Musician scale. The interest groups and their 
respective code numbers are listed in Table 1. 
To follow conventional terminology the code 
numbers correspond to the interest groups on 
the SVIB profile sheet rather than to con- 
secutive numerical order. 

Unpatterned profiles, those with no pri- 
mary, secondary, or reject patterns, have not 
been assigned a code position or number. 
Rather, they are indicated simply by the ab- 
sence of code designations for the regular pat- 
terns. Provision for identifying unpatterned 
profiles in the system was made because of 
their relatively high incidence in younger age 
groups and in groups of physically mature 
but less well-adjusted clients. By coding 
these uncrystallized patterns, impetus may be 
given to their further study and explication. 
Their meaning at the present time is quite 
uncertain. 


Procedure in Coding a Profile 
To code an SVIB profile the steps are
as follows:


1. Determine primary, secondary, and re- 
ject patterns according to Darley and Hag- 
enah’s (1956) procedure. 

2. Form the profile code. List the num- 
bers of those interest groups in which there 
are primary patterns; set these off with a 
single prime (’). Then list the numbers cor- 
responding to those interest groups which 
have secondary patterns; follow those with 
a double prime (”). Finally, list those in- 
terest groups which contain reject patterns 
and place a dash (—) after the last numeral. 
List lower numbers first when forming the 
code. 

3. Code the nonoccupational scales in the 
following manner: first, assign the symbols 
X, Y, and Z to represent the score ranges 60 
or above, 40 to 59, and 0 to 39, respectively; 
and, second, form the code from left to right 
in the sequence OL, MF, IM (if specializa- 
tion level, SL, is coded, place it between MF 
and IM). 


To illustrate the procedure consider the 
SVIB profile formed by the letter rating in 
Table 1. First, the primary, secondary, and 
reject patterns are identified. In this in- 
stance, primary patterns occur in the biologi- 
cal science, physical science, and technical 
occupational interest groups; there are no 
secondary patterns; and, reject patterns oc- 
cur in the business detail, business contact, 
and verbal-linguistic areas. Next, using the 
appropriate code numbers for each interest 
group the profile code is formed. For the 
illustrative profile the code is 124’890-. Fi-
nally, the code symbols for the nonoccupa-
tional scales are added. The scores for OL,
MF, and IM, in the example, were 62, 44,
and 39, respectively. Thus, the full code was
124’890-XYZ. Examples of codes for some
other possible profiles are: unpatterned
XYZ; single primary (business detail)
8’XYZ; secondary only 289”XYZ; reject
only 12590-XYZ.
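The coding steps can be sketched in a few lines of Python. This is an illustrative sketch only: the function and variable names are hypothetical, the ASCII characters ' and " stand in for the single and double prime, and the code numbers for the interest groups are inferred from the worked example rather than read from the SVIB profile sheet. The digit 0 is sorted after 9, as in the example code.

```python
# Illustrative sketch of the coding procedure; all names are hypothetical.
# Code numbers are inferred from the worked example in the text.
GROUP_CODES = {
    "biological science": 1,
    "physical science": 2,
    "technical": 4,
    "social service": 5,
    "business detail": 8,
    "business contact": 9,
    "verbal-linguistic": 0,
}


def scale_symbol(score):
    """Map a nonoccupational score to X (60 or above), Y (40-59), or Z (0-39)."""
    if score >= 60:
        return "X"
    if score >= 40:
        return "Y"
    return "Z"


def _digits(groups):
    """Return the code digits for the named groups, lower numbers first, 0 last."""
    codes = [GROUP_CODES[g] for g in groups]
    codes.sort(key=lambda c: 10 if c == 0 else c)
    return "".join(str(c) for c in codes)


def profile_code(primary, secondary, reject, ol, mf, im, sl=None):
    """Form the profile code: primaries, secondaries, rejects, then scale symbols."""
    code = ""
    if primary:
        code += _digits(primary) + "'"    # single prime closes the primary patterns
    if secondary:
        code += _digits(secondary) + '"'  # double prime closes the secondary patterns
    if reject:
        code += _digits(reject) + "-"     # dash closes the reject patterns
    # Sequence is OL, MF, IM; SL, if coded, goes between MF and IM.
    scores = [ol, mf] + ([sl] if sl is not None else []) + [im]
    return code + "".join(scale_symbol(s) for s in scores)


example = profile_code(
    primary=["biological science", "physical science", "technical"],
    secondary=[],
    reject=["business detail", "business contact", "verbal-linguistic"],
    ol=62, mf=44, im=39,
)
print(example)
```

Under these assumptions the call reproduces the full code of the illustrative profile, with the unpatterned case reducing to the three scale symbols alone.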


Discussion 


The proposed coding system approximates
most of the characteristics which it should
ideally have. It is operational in nature and
objective to the extent that the judgments of




John O. Crites 


Table 1

Code Numbers for Interest Groups on SVIB and Illustrative Profile

Code  Interest Group       Occupational Scales                        Illustrative Profile*

1     Biological Sciences  Artist, Psychologist, Architect,           C+, B-, B-, B+, A, A, A
                           Physician, Osteopath, Dentist,
                           Veterinarian
2     Physical Sciences    Mathematician, Physicist, Engineer,        B-, B-, A, A
                           Chemist
4     Technical            Farmer, Aviator, Carpenter, Printer,       A, A, B-, A, A, B-, A, B+,
                           Math Teacher, Ind. Arts Teacher,           B+, B
                           Ag. Teacher, Policeman, For. Sv. Man,
                           Prod. Manager (Group III)
5     Social Service       Y.M.C.A. Phys. Dir., Personnel Dir.,
                           Pub. Administrator, Y.M.C.A. Sec'y,
                           Soc. Sci. Teacher, City Sch. Supt.,
                           Minister
8     Business Detail      Sr. C.P.A., Acct., Off. Man., Pur. Agent,
                           Banker, Mortician, Pharmacist
9     Business Contact     Sales Mgr., Real Est. Sales,
                           Life Ins. Sales, Pres. Mfg. Concern
                           (Group XI)
0     Verbal-Linguistic    Adv. Man, Lawyer, Auth.-Jr.,
                           C.P.A. (Group VII)

Code Symbols for Nonoccupational Scales on SVIB Profiles

X     Nonoccupational scale score range of 60 or above
Y     Nonoccupational scale score range of 40 to 59
Z     Nonoccupational scale score range of 0 to 39

* C letter ratings are to left of shaded area of SVIB profile sheet.


primary, secondary, and reject patterns are 
reliable (Darley & Hagenah, 1956). The 
coded profiles can be communicated verbally 
with a minimum of awkwardness and dis- 
tortion, although it may be too cumbersome 
to express other than the major characteris-
tics of a profile such as abbreviating 124’890-
XYZ to “one-twenty-four.” All possible pro-
files can be coded and filed according to type
of pattern. And, coded profiles can be used 
in research since they can be quantified on 
at least a nominal level of measurement, e.g.,
either an individual has or does not have a 
particular SVIB pattern. 
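Nominal-level quantification of this kind can be illustrated with a short Python sketch that turns a coded profile into has/does-not-have indicators. The parser below is hypothetical: it assumes codes are written with ASCII ' and " in place of the prime marks, that the interest-group digits precede their marker, and that three or four scale symbols close the code.

```python
import re

# Hypothetical pattern: optional primary, secondary, and reject segments,
# followed by three (or four, if SL is coded) nonoccupational symbols.
CODE_RE = re.compile(r"^(?:(\d+)')?(?:(\d+)\")?(?:(\d+)-)?([XYZ]{3,4})$")


def parse_code(code):
    """Split a coded profile into its pattern sets and scale symbols."""
    match = CODE_RE.match(code)
    if match is None:
        raise ValueError(f"not a recognizable profile code: {code!r}")
    primary, secondary, reject, scales = match.groups()
    return {
        "primary": set(primary or ""),
        "secondary": set(secondary or ""),
        "reject": set(reject or ""),
        "scales": scales,
    }


def has_pattern(code, kind, group_digit):
    """Return 1 if the profile has the given pattern in the given group, else 0."""
    return int(group_digit in parse_code(code)[kind])


profiles = ["124'890-XYZ", "XYZ", "8'XYZ", '289"XYZ', "12590-XYZ"]
# One 0/1 indicator per profile: primary pattern in interest group 1?
primary_in_group_1 = [has_pattern(p, "primary", "1") for p in profiles]
print(primary_in_group_1)
```

Indicators of this form could then be cross-tabulated against other variables, which is all that a nominal level of measurement requires.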

A number of research possibilities using 
coded SVIB profiles are feasible. One line of 
investigation might be the identification of 
personality characteristics which are associ- 
ated with various combinations of primary, 
secondary, and reject patterns. Study of ap- 


propriate and inappropriate interest patterns 
in relation to the self-concept should be fa- 
cilitated through the use of coded profiles 
(Super, 1954). Hypotheses concerning per- 
sonality differences between individuals who 
have the same primary but different support- 
ing secondary or reject patterns should be 
testable when profiles are coded. Possible 
“conflict” patterns, such as 20’s or 59’s which 
represent opposing and often contradictory 
occupational stereotypes and values, can be 
extensively studied. And, the meaning of un- 
patterned profiles may become clearer as their 
correlates are identified. 

A concluding thought: before the coding 
system is used in research it would seem de-
sirable to have it fully evaluated and modi-
fied, if necessary, in order to promote stand-
ard procedures in experimentation and, ulti-
mately, comparability of results.







Summary 


A coding system for total profile analysis 
of the SVIB was proposed which would rep- 
resent the elevation and shape of the interest 
pattern as well as have other characteristics 
desirable for definition, communication, filing, 
and research. The basic structure of the sys- 
tem was outlined, the steps in coding a pro- 
file were delineated, and an illustration of the 
procedure was given. Some possible areas of 
research using the coded SVIB profiles were 
briefly discussed. 


Received August 4, 1958. 


References


Crites, J. O. Ability and adjustment as determinants 
of vocational interest patterning in late adoles- 
cence. Unpublished doctoral dissertation, Colum- 
bia Univer., 1957. 


Cronbach, L. J., & Gleser, Goldine C. Assessing
similarity between profiles. Psychol. Bull., 1953,
50, 456-473.

Darley, J. G. Clinical aspects and interpretation of
the Strong Vocational Interest Blank. New York:
Psychological Corp., 1941.

Darley, J. G., & Hagenah, Theda. Vocational inter-
est measurement. Minneapolis: Univer. Minne-
sota Press, 1956.

DuMas, F. M. On the interpretation of personality
profiles. J. clin. Psychol., 1947, 3, 57-64.

Hathaway, S. R. A coding system for MMPI pro-
files. J. consult. Psychol., 1947, 11, 334-337.

Hathaway, S. R., & Meehl, P. E. Psychiatric im-
plications of code types. In G. S. Welsh & W. G.
Dahlstrom (Eds.), Basic readings on the MMPI
in psychology and medicine. Minneapolis: Uni-
ver. Minnesota Press, 1956. Pp. 136-144.

Kirk, Barbara A. Review of Vocational interest
measurement. J. counsel. Psychol., 1956, 3, 309.

Strong, E. K., Jr. Vocational interests of men and
women. Stanford: Stanford Univer. Press, 1943.

Super, D. E. The measurement of interests. J.
counsel. Psychol., 1954, 1, 168-171.

Welsh, G. S. An extension of Hathaway’s MMPI
coding system. J. consult. Psychol., 1948, 12,
343-344.




Journal of Applied Psychology
Vol. 43, No. 3, 1959


ART WORK VERSUS PHOTOGRAPHY: 
AN EXPERIMENTAL STUDY 


CHARLES WINICK 


Graduate School of Business, Columbia University 


Various attempts to measure the compara-
tive effectiveness of photography and art work
methods of illustrating advertisements have 
generally found photography to be superior, 
usually because of its greater realism (Which 
Ad Pulled Best?: 1947, 1949, 1950, 1951). 
A few studies have found art work superior 
(Which Ad Pulled Best?: 1950, 1951, 1952, 
1957). Photography illustrations of adver- 
tisements appearing in a business weekly 
(Best Read Industrial Advertisements, 1953), 
an industrial magazine (Starch, 1954), and 
a trade paper (DeWolf, 1954), have been 
reported to be superior to comparable art 
work illustrations. 

The playback method in which a respondent 
is asked what he recalls from a given adver- 
tisement in a specific issue of a magazine has 
also been used to measure the comparative 
efficacy of different methods of illustration. 
One measure used in playback is Proved 
Name Registration, which is a score repre- 
senting the proportion of readers of a given 
issue of a magazine which can confirm having 
seen an advertisement by recalling one or 
more of its copy points. In food factorization 
on Proved Name Registration, there was a 
22% penalty for sketch art compared with 
photography. Advertisements with four color 
photo realism averaged 23% in playback of 
beauty, whereas advertisements with sketch 
art averaged 15%. 


Procedure 


In order to get empirical data on the comparative 
effect of photographs and art work as methods of 
illustrating consumer magazine advertising, it was 
decided to use the accordion method of paired com-
parisons, in which two paired groups of subjects
would each be shown an accordion folder. Each 
folder consisted of a series of four advertisements, 
with the dimension of art work and photography 
varied in only one of the four advertisements and 
everything else held constant: 


Accordion Folder 1        A     X     Y     Z
Accordion Folder 2        A’    X     Y     Z


Thus, in Accordions 1 and 2, the advertisements
X, Y, and Z would be exactly identical in both
accordions. Advertisements A and A’ would have
the identical text, but A would be illustrated by art
work and A’ by a photograph of exactly the same
situation or scene. In order to maximize compara-
bility, experienced commercial artists and photog-
raphers did both photographs and art work as re-
quired. The 4 test advertisements were taken from
a 1955 issue of Life magazine, and the nontest
advertisements also came from the same magazine.
Three were illustrated by photographs and one by
art work, in the advertisements which actually ap-
peared in the magazine. Art work illustrations were
prepared for the three advertisements which had
photographs and a photograph was prepared for the
advertisement which had an art work illustration.
For each advertisement, there were thus two forms
available: one with an art work illustration and one
with a photograph illustration.

The advertisements were selected because they were 
for products which had considerable general appeal, 
and because the illustrations covered a range of sub- 
ject matter. They were for a gasoline, a soft drink, 
a toothpaste, and an instant coffee. In each case, 
the illustration was fairly large and covered over 
half the page. The gasoline illustration (A and A’)
was of several dogs chasing each other, the soft
drink illustration (B and B’) was of a man drinking
soda, the toothpaste illustration (C and C’) was of a
smiling stenographer typing in an office, and the
coffee illustration (D and D’) was of a cup of coffee
with a man sniffing it. The nontest advertisements
(X, Y, Z) were different for each of the 4 sets
of advertisements under experimental study.

A modified probability sample of 962 adults was
selected in the Metropolitan New York area. There
were 784 females and 178 males in the sample.
Women were oversampled because of the extent to
which the products advertised are bought by women.
Interviews were conducted in the respondents’ homes,
and averaged about 35 minutes in length.

The subject population was divided into two
matched groups of 481 each. The groups were
matched on age and on socioeconomic and family
status. Each respondent in the two groups of 481
thus saw either the art work or the photography
version of each advertisement. The advertisements
in the accordion were presented in systematically
varied arrangements (k!), in order to eliminate the
bias which might be created by following the same
sequence of presentation for all Ss.

The Ss were asked to rank each of the four adver- 
tisements in the accordion presented to them as first, 







second, third, and fourth, on the dimensions of which 
advertisements they liked the most and which adver- 
tisements they felt were easiest to believe. The 
sequence of these two questions was rotated in order 
to avoid any possible bias. After these two ques- 
tions had been asked, the interviewer then took the 
accordion away and the Ss were asked to describe 
everything they recalled in each advertisement. The 
number of percepts recalled correctly from each illus- 
tration was counted and the advertisement which 
had the greatest number of percepts correctly recalled 
was given Rank 1, the advertisement with the second 
greatest number of percepts Rank 2, and so forth.
This provided the third dimension of recall, on which 
the two groups of advertisements were ranked. 

The number of ranks, from 1 to 4, which each 
advertisement had on each of the three dimensions 
of: (a) liked most (an inferred measure of impact), 
(b) most believable, and (c) recall, was tabulated.
After totalling each advertisement’s rank, a chi-square 
test was conducted in order to test the significance 
of the differences between the two versions of each 
paired advertisement on each of the three dimensions 
studied. 

Due to the nonparametric nature of the data and 
the nature of the ranking process, it was possible 
to group and reclassify the data in order to get a 
more meaningful test of underlying patterns without 
interfering with the assumptions of the chi-square 
test. Ranks 1 and 2 on each dimension were grouped 
together into one larger rank, as were Ranks 3 
and 4. The four ranks thus became two ranks, and 
were compared with each other. The chi-square test 
was used to determine the significance of the differ-
ence between the groups of combined ranks.
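The regrouping of ranks and the resulting test can be sketched in Python. The rank tallies below are hypothetical and are not data from the study; the sketch also assumes an uncorrected Pearson chi-square on the 2 x 2 table of version by combined rank, since the exact computing formula is not reported.

```python
# Hypothetical rank tallies for one paired advertisement (not data from the study).
# Each list gives the number of Ss assigning Rank 1, 2, 3, and 4.
art_ranks = [100, 120, 130, 131]    # art work version, n = 481
photo_ranks = [150, 140, 100, 91]   # photograph version, n = 481


def collapse(rank_counts):
    """Group Ranks 1 and 2 together and Ranks 3 and 4 together."""
    r1, r2, r3, r4 = rank_counts
    return r1 + r2, r3 + r4


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator


art_high, art_low = collapse(art_ranks)
photo_high, photo_low = collapse(photo_ranks)
chi_sq = chi_square_2x2(art_high, art_low, photo_high, photo_low)
print(f"chi-square (1 df) = {chi_sq:.3f}")
```

Collapsing four ranks into two leaves a table with a single degree of freedom, which is the case the shortcut formula above covers.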


Results 

The chi-square and p values obtained are 
shown in Table 1. 
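As a rough check on how chi-square values of this size translate into probability levels, the following Python sketch converts the most-liked values from Table 1 into upper-tail probabilities. It assumes one degree of freedom (a 2 x 2 comparison of combined ranks), which the article does not state explicitly; the function name is hypothetical.

```python
import math


def chi_square_p(chi_sq):
    """Upper-tail probability of a chi-square value, assuming 1 degree of freedom."""
    # With 1 df, chi-square is the square of a standard normal deviate,
    # so the tail area equals erfc(sqrt(chi_sq / 2)).
    return math.erfc(math.sqrt(chi_sq / 2.0))


# Most-liked chi-square values as printed in Table 1.
most_liked = {
    "A. Gasoline": 0.706,
    "B. Soft Drink": 19.550,
    "C. Tooth Paste": 1.008,
    "D. Coffee": 28.100,
}

for advertisement, value in most_liked.items():
    print(f"{advertisement}: chi-square = {value:.3f}, p = {chi_square_p(value):.4f}")
```

Under the one-degree-of-freedom assumption, the soft drink and coffee values fall well beyond the .001 level, while the gasoline and toothpaste values do not reach the .05 level.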

By and large, an illustration for an adver- 
tisement which ranks high in one of the three 
dimensions studied appears to have some 
tendency to rank high on the other two di- 


mensions, suggesting the operation of a kind 
of halo effect. 

In only one of the four test advertisements 
(A and A’) did art work appear to be pre- 
ferred, and then only on the dimension of 
believability (p< .05). This was perhaps 
due to the humorous and attention getting 
nature of the cartoon illustration showing 
running dogs. This advertisement’s feeling 
of movement would probably help it to stand 
out from the pages of the magazine as a 
reader riffled through them. It would prob- 
ably have, therefore, obtained more initial 
attention from an actual reader of the maga- 
zine than was possible in this experiment. 

The most clear-cut superiority of photog- 
raphy over art work was in the second adver- 
tisement; on the dimension of most liked, 
the photograph was preferred to art work 
(p < .001) and on believability and recall 
the degree of preference was also significant 
(p< .01). This result was in line with cur- 
rent advertising practice, which has empha- 
sized the effectiveness of photographs of food 
and drink. 

The third advertisement, with an illustra- 
tion of a stenographer smiling in an office, was 
more believable in the photography version 
(p < .02) than its art work version. It might 
be speculated that this result occurred because 
interiors presented by photographs permit 
relatively rapid recognition and permit the per-
ceiver to relate them to previous experience.

The fourth advertisement, with a relatively 
realistic scene of a man with a cup of coffee, 
also displayed a high degree of superiority for 
photography over art work, on the dimension 


of most liked (p < .001). On the dimensions 


Table 1

Chi-Square and p Values for Ranked Paired Advertisements on the Dimensions of Most Liked,
Believability, and Recall

Subject of           Most Liked     Believability     Recall
Advertisement        Chi square     Chi square        Chi square

A. Gasoline              .706          4.302            .808
B. Soft Drink          19.550          7.886           7.923
C. Tooth Paste          1.008          5.928            .967
D. Coffee              28.100           312            3.070







of believability and recall, the degree of supe- 
riority of photography was not statistically 
significant (p < .10). 

Sex and socioeconomic status did not ap- 
pear to be significantly associated with a 
preference for any of the four pairs of adver-
tisements shown or with the three variables
studied. 

Discussion 

On the basis of these four pairs of adver- 
tisements, it would appear that photographic 
representations of human beings, edibles, and 
interiors, are easier to identify with and recall 
details of, and provide a clearer and more 
realistic visual demonstration of their subject 
matter’s content and meaning than does an 
art work representation. 

Art work would appear, on the basis of 
these results, to be particularly appropriate 
where an unusual kind of attention getting 
illustration is needed. Other areas of possible 
art work superiority would be the difficulty 
of photographing certain situations, or the 
difficulty of photographing something which 
has yet to happen or which has already hap- 
pened. It is of course fallacious to assume 


that there is any one kind of photographic or 


any one kind of art work illustration. There 
are many different ways of photographing an 
object or person, and some art work may
even be more representational than some
photographs.


Summary and Conclusions 


Four paired advertisements, one using pho- 
tography and the other using art work, were 




shown to a matched sample of 962 adults in 
the New York City area. The Ss ranked 
each advertisement on the dimensions of most 
liked, believability, and recall. No sex or 
socioeconomic differences emerged. Statisti- 
cally significant differences were found for the 
photographic version of three advertisements, 
respectively showing a man, a woman in an 
office, and a man drinking coffee. Art work 
was favored in one advertisement, which semi- 
humorously showed a dog in motion. 

Any decision to use either art work or pho- 
tography for a communication depends on 
many factors, including the object to be re- 
produced, the medium of communication, the 
effect desired, and the associated text. The 
results of this study, based on advertisements 
from one consumer magazine, must be inter- 
preted with caution. 


Received August 18, 1958. 


References 


Best Read Industrial Advertisements . . . Photos
Outpull Drawings. Industr. Marketing, 1953, 38
(10), 174.

DeWolf, J. Why these trade ads have visual im-
pact. Printers’ Ink, 1954, 246(9), 42-45.

Starch, D., et al. Tested Copy. 1954, No. 65, 2-3.

Which Ad Pulled Best? Printers’ Ink, 1947, 219(2),
40; 1949, 228(7), 36; 1949, 229(5), 39; 1949,
220(8), 42; 1950, 232(3), 37; 1951, 236(4), 35;
1951, 237(2), 58.

Which Ad Pulled Best? Printers’ Ink, 1950, 233(12),
27; 1951, 237(5), 58; 1952, 239(5), 39; 1957,
258(4), 45.





Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


SELF-PERCEPTIONS OF FIRST-LEVEL SUPERVISORS 
COMPARED WITH UPPER-MANAGEMENT PER- 
SONNEL AND WITH OPERATIVE LINE 
WORKERS 


LYMAN W. PORTER 


University of California 


The psychological position of the first-level 
supervisor, or foreman, has long been an im- 
portant topic in personnel psychology. The 
problem of just where the foreman fits into 
the organizational setup and how he relates 
to those in other positions in the organization 
has been considered in a number of different 
ways (Gardner & Whyte, 1945; Roethlis- 
berger, 1945; Turner, 1954; Wray, 1949). 
For example, Roethlisberger, in discussing the 
foreman’s situation when he is interacting 
with other types of employees, notes that: 
“Nowhere in the industrial structure more 
than at the foreman level is there so great 
a discrepancy between what a position ought 
to be and what a position is. This may ac- 
count in part for the wide range of names 


which foremen have been called—shall we say 
“informally”—and the equally great variety 
of definitions which have been applied to 
them in a more strictly formal and legal 


sense” (1945). Roethlisberger himself refers 
to the foreman as “master and victim of 
double talk” and “victim, not monarch, of 
all he surveys.” Gardner and Whyte (1945) 
refer to the foreman as “the man in the mid- 
dle,” and Wray (1949) has called the fore- 
man the “marginal man of industry.” 

All of these investigators point to the fact 
that the foreman occupies a unique position 
in the formal organization. On the one hand, 
he is legally a part of management and re- 
ceives his orders and directions from manage- 
ment. When he interacts upward, he interacts 
with other management personnel. When he 
interacts downward, however, his position is 
different from that of any other in manage- 
ment, because the employees he directs are 
operative line employees, and, therefore, non- 
management. Thus, he directs people who 
are not part of his own group or part of the 




group to which his superiors belong. For this 
reason his position is unique among manage- 
ment positions. 

Another distinctive feature of the first-level 
supervisor’s position is the fact that typically 
he was formerly a part of the operative group 
which he now must direct. Other members 
of management usually enter the organization 
as part of management and continue in that 
same capacity. They do not have to change 
their allegiance as they advance, and they 
continue to supervise from within the same 
group in which they started. This means that 
the foreman has additional personnel relations 
problems in his work not faced by most 
upper-management personnel. 

The above considerations indicate that the 
first-level supervisor has an involved task in 
gaining the approval of those with whom he 
works. If he tries to follow directives from 
above in such a way as to give maximum 
satisfaction to upper management, he may 
decrease his popularity and effectiveness with 
the men he supervises. On the other hand, 
if he tries too much to carry out his duties 
in a way that will most please those under 
him, he may not receive the maximum ap- 
proval from his management superiors. Any 
person who supervises and is supervised faces 
this problem to some extent, but it becomes 
most acute for those at the foreman level. 
The foreman’s position, then, is one that is 
psychologically perhaps the most difficult in 
the entire organization. Because of the varied 
nature of the expectations that others hold 
for him, the foreman does not have an easy 
task in trying to maintain a clear-cut self- 
perception. Since a person’s self-perception 
is probably strongly influenced by the role 
demands he perceives operating in his work 
situation, it may be instructive to compare 







the self-descriptions of first-level supervisors 
with those above them (upper management) 
and with those below them (line workers). 
The present study is concerned with these 
comparisons. 


Method 


The instrument used in this study to obtain the 
self-perceptions was a 64-pair forced-choice adjective 
check list developed by Ghiselli and used in previous 
studies (Ghiselli, 1954; Porter, 1958; Porter & 
Ghiselli, 1957). 

The self-description inventory was completed by 
172 first-level supervisors, 291 upper-management 
personnel, and 320 operative line workers. For the 
purpose of this study, “first-level supervisors” are 


Table 1

Items Differentiating Upper-Level Management
Personnel and First-Level Supervisors

First-Level                   Upper-Level Management
Supervisors                   Personnel

See themselves as:            See themselves as:
planful                       resourceful
deliberate                    sharp-witted
calm                          sincere
fair-minded                   thoughtful
steady                        sociable
responsible                   reliable
civilized                     dignified
self-controlled               imaginative
logical                       adaptable
patient                       sympathetic
honest                        generous

Do not see themselves as:     Do not see themselves as:
moody                         affected
stubborn                      cold
conceited                     infantile
stingy                        shallow
touchy                        defensive
dreamy                        dependent
nervous                       intolerant
careless                      foolish
egotistical                   apathetic
evasive                       despondent
selfish                       weak
self-centered                 rude
disorderly                    rattle-brained
fussy                         submissive
opinionated                   pessimistic
excitable                     sly
impatient                     irresponsible


defined as those who directly supervise operative
line workers, regardless of the specific title used by
particular organizations. “Operative line workers”
are all those who are on the bottom level of an
organization and who have no supervisory duties,
and “upper management” consists of all management
personnel above the first-level supervisors. The upper-
management group thus consists of top management
people (presidents, vice-presidents, and officers of
similar rank) and middle management people
(operating division and department heads and various
staff personnel such as personnel managers and pur-
chasing agents). All three samples of Ss were drawn
from organizations heterogeneous as to geographical
location and nature of enterprise. Although the
inventory was administered in a variety of circum-
stances, none of the Ss was familiar with the par-
ticular uses that would be made of his answers to
the check list. Therefore, it is probable that no
systematic “set” other than that of accurate self-
description was operating for all Ss within any of
the three personnel categories formed for this study.
However, any given S may have used a set other
than a self set, and hence the composite self-descrip-
tions obtained for each category may be made up
of persons having somewhat varying sets.


Results and Discussion 


The responses of the 783 individuals par- 
ticipating in the study were analyzed for each 
of the 64 pairs of adjectives. Table 1 pre- 
sents the 28 pairs that differentiated between 
first-level supervisors and upper-level manage- 
ment personnel at the .05 level of confidence 
or better. Eleven of the pairs are composed 
of favorable adjectives, and the other 17 of 
unfavorable terms that “least describe” the 
individual. Table 2 presents the 14 items 
that differentiated the first-level supervisors 
from line workers at the .05 confidence 
level or better. Nine of these pairs are 
composed of favorable adjectives, and five of 
“least descriptive” unfavorable traits. 

As has been noted in a previous paper,
when results are presented in this manner, 
“the differences are relative, and do not neces- 
sarily indicate that one adjective in a pair 
was favored by the majority of one group 
and the other adjective by a majority of the 
other group” (Porter, 1958). In many in- 
stances, a majority of each group actually 
favored the same adjective, but the size of 
the majority was significantly greater in one 
of the groups. Therefore, one group rela- 
tively more often chose one of the adjectives, 
and the other group relatively more often 







chose the other adjective. Also, when a per- 
son selected one word in a pair, he was not 
necessarily rejecting the other word in the 
pair, but was only indicating that the chosen 
word was more or less descriptive of him than 
the other word. Additionally, it should be 
pointed out that the specific list of traits 
obtained in the comparisons was in part a 
function of the specific words contained on 
the check list. However, 128 adjectives (64 
pairs of words) constituting a wide range of 
typical traits were available for choice, and 
the results obtained should not be strongly 
biased due to a particular sampling of words 
contained in the inventory. 
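The point that a difference can be significant even when a majority of both groups favors the same adjective can be illustrated with a short Python sketch. The counts are hypothetical (only the group sizes of 172 and 291 come from the study), and the use of an uncorrected chi-square with one degree of freedom is an assumption, since the article does not name the significance test applied to each pair.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi_sq):
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_sq / 2.0))


# Hypothetical choices on one adjective pair (first word vs. second word).
supervisors_first, supervisors_second = 120, 52   # 172 first-level supervisors
managers_first, managers_second = 165, 126        # 291 upper-management personnel

share_supervisors = supervisors_first / (supervisors_first + supervisors_second)
share_managers = managers_first / (managers_first + managers_second)

chi_sq = chi_square_2x2(supervisors_first, supervisors_second,
                        managers_first, managers_second)
p = p_value_1df(chi_sq)

print(f"supervisors choosing first word: {share_supervisors:.1%}")
print(f"managers choosing first word:    {share_managers:.1%}")
print(f"chi-square = {chi_sq:.2f}, p = {p:.4f}")
```

In this made-up case both groups favor the first word, yet the pair would still differentiate the groups at the .05 level because the majority is reliably larger among the supervisors.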

Examination of Table 1 reveals that the 
first-level supervisors, if contrasted with upper 
management, tended to perceive themselves in 
terms that indicate a careful and controlled 
approach toward their job and toward other 
people in their work environment. Among 
the favorable adjectives, the supervisors rela- 
tively more often checked words like “planful,” 
“deliberate,” “calm,” and “self-controlled,”
while the upper-level management personnel
for these same pairs relatively more often
checked “resourceful,” “sharp-witted,” “sin-
cere,” and “imaginative.” Among the un-
favorable adjectives, traits such as “moody,”
“stubborn,” “careless,” “evasive,” “disor-
derly,” “fussy,” and “opinionated,” seem to
characterize the type of person that a super-
visor does not see as himself, when the com-
parison is with upper-level managers. Un-
favorable traits relatively more often checked
by the higher-management personnel for these
same pairs include “affected,” “cold,” “fool-
ish,” “despondent,” “rattle-brained,” “sub-
missive,” and “pessimistic.” The favorable
and unfavorable traits taken together show 
that the typical supervisor sees himself as a 
more conservative person than does the typical
upper-management person. The supervisors
were seldom the group that more often checked
an adjective indicating independence or
strong aggressiveness. The upper-management
personnel, on the other hand, seemed
relatively more frequently to check adjectives 
that gave a picture of greater enterprise, origi- 
nality, and boldness. 

The results presented in Table 2 show how 
the self-perceptions of the first-line supervisors
compared with those of the people they
direct, the operative line workers. It can be 
seen from Table 2 that the items relatively 
more characteristic of supervisors in this com- 
parison are ones that give somewhat the same 
picture of these individuals as was found in 
Table 1, even though their descriptions are 
now being contrasted with an entirely different 
group. Nine of the 14 differentiating items 
in Table 2 were also items found in the Table 
1 comparisons, and on six of these nine items 
the supervisors differed in the same way from 
operative workers as they did from upper 
management. In other words, on a majority 
of items that were constant to both compari- 
sons, there was not a trend from upper man- 
agement to supervisors to operative workers; 
supervisors, instead, seem a group set apart 
in the same way from both men above and 
below them in the organization. They more 
often saw themselves as deliberate, fair- 
minded, steady, responsible, logical, and not 
dreamy, instead of sharp-witted, thoughtful, 
sociable, reliable, adaptable, and not depend- 
ent, when compared either to management or 
to line operatives. Only on rude-self-centered,
rattle-brained-disorderly, and submissive-fussy
did supervisors differ from line operatives in
the same direction as upper-management personnel
had differed from them. Just as supervisors
picture themselves in conservative and
careful terms in comparison with upper-management
personnel, they likewise tend to
picture themselves in these same terms in
comparison with operative personnel. There
are other areas in which they also seem to
differ from line workers. They do not appear
as concerned with being gregarious and
friendly, and also seem less submissive and
flexible. In short, foremen seem to be especially
conscious of a supervisory role.


Table 2


Items Differentiating First-Level Supervisors
and Line Workers


First-Level Supervisors          Line Workers

See themselves as:               See themselves as:

  energetic                        ambitious
  practical                        industrious
  deliberate                       sharp-witted
  clear-thinking                   efficient
  fair-minded                      thoughtful
  steady                           sociable
  modest                           pleasant
  responsible                      reliable
  logical                          adaptable

Do not see themselves as:        Do not see themselves as:

  dreamy                           dependent
  rude                             self-centered
  rattle-brained                   disorderly
  submissive                       fussy
  cynical                          aggressive

The study as a whole tends to show that 
the self-perceptions of supervisors reflect their 
unique position in the structure of organiza- 
tions. Their self-descriptions show certain 
differences from those of men they direct, 
but they also show somewhat the same differ- 
ences from those of men who direct them. 
Supervisors do not differ from subordinates 
in the same way that their superiors differ 
from them. Since their number one duty is 
direct supervision, their self-perceptions may 
be more acutely affected by the role demands 
of this type of activity than are the self-perceptions
of upper-management people who
also have supervisory duties but in addition
have other more general administrative functions.
If the role demands are largely responsible
for shaping the self-perceptions of first-level
supervisors, then the self-descriptions
provide data on how the supervisors interpret 
the role demands. The general picture of 
cautious individuals that seemed to emerge 
from the findings may be indicative of the 
psychological position as well as the strictly 
formal position in which they see themselves 
in their organizations. Being “marginal men” 
or “men in the middle,” both formally and 
psychologically, they may reflect this situa- 
tion by seeing themselves as individuals who 
act with restraint in carrying out their super- 
visory functions. 




Summary 


The self-perceptions of 172 first-level super- 
visors were compared to those of 291 upper- 
management individuals and to 320 operative 
line workers. Ss were employed by a wide 
variety of industrial and business organiza- 
tions, with the self-descriptions being obtained 
by administration of a 64-item forced-choice 
adjective check list. The items that differen- 
tiated between supervisors and upper-manage- 
ment personnel tend to show that foremen 
view themselves as more conservative and 
cautious individuals in comparison with those 
above them in management. When super- 
visors’ self-descriptions are compared with the 
self-descriptions of operative line workers, 
similar results occur; supervisors appear to 
view themselves as more careful and restrained 
individuals than do operative workers. There 
thus does not appear to be a consistent trend 
in self-perceptions from upper-level managers 
to supervisors to line workers; instead, super- 
visors’ self-perceptions seem to show that 
these men are a group different in somewhat 
the same way from both those above them 
and those below them in the organizational 
hierarchy. 


Received August 25, 1958 


References 


Gardner, B. B., & Whyte, W. F. The man in the 
middle: Positions and problems of the foreman. 
Appl. Anthrop., 1945, 4, 1-28. 

Ghiselli, E. E. The forced-choice technique in self-
description. Personnel Psychol., 1954, 7, 201-208.

Porter, L. W. Differential self-perceptions of man- 
agement personnel and line workers. J. appl. 
Psychol., 1958, 42, 105-108. 

Porter, L. W., & Ghiselli, E. E. The self perceptions 
of top and middle management personnel. Per- 
sonnel Psychol., 1957, 10, 397-406. 

Roethlisberger, F. J. The foreman: Master and vic- 
tim of double talk. Harv. bus. Rev., 1945, 23, 
283-298. 

Turner, A. N. Foremen—key to worker morale 
Harv. bus. Rev., 1954, 32, 76-86. 

Wray, D. E. Marginal men of industry: The fore- 
men. Amer. J. Sociol., 1949, 54, 298-301. 





Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


SUBLIMINAL PERCEPTION: SOME NEGATIVE FINDINGS 


ALLEN D. CALVIN 


Hollins College 


AND KAREN S. DOLLENMAYER 


Northwestern University 


The recent reports in the popular press con- 
cerning the claim that stimuli below the level 
of conscious awareness can influence behavior
have had remarkable repercussions, with
both Congress and the FCC investigating the
problem. The claim by a commercial organization
that it had succeeded in increasing
the sales of Coca-Cola 18% and those
of popcorn over 50% by flashing subliminal
messages at 1/3000 of a second during a motion
picture program was the principal cause
of concern. The major television networks
have responded by banning the use of such
techniques.

McConnell et al. (McConnell, Cutler, & 
McNeil, 1958) have just completed an ex- 
cellent comprehensive review of the experi- 
mental evidence relating to this problem. 
They conclude by saying, “One fact emerges 
from all of the above. Anyone who wishes to 
utilize subliminal stimulation for commercial 
or other purposes can be likened to a stranger 
entering into a misty, confused countryside 
where there are but few landmarks. Before 
this technique is used in the market place, if 
it is to be used at all, a tremendous amount 
of research should be done, and by com- 
petent experimenters” (p. 237). 


Method 


Subjects. Sixty female undergraduate students
served as Ss.

Apparatus. A Gerbrands’ tachistoscope was used. The lamps
used for illumination in our model of the Gerbrands
tachistoscope are 4-watt, daylight, fluorescent lamps.
These lamps are operated on an ignition voltage of
approximately 250 volts d.c. and a filament voltage
of 6.3 volts a.c. The normal operating current of
each lamp is approximately 130 mils. There are
four lamps in the tachistoscope, 2 for the exposure
field and 2 for the pre-exposure field.

Procedure. The Ss were seen in individual experimental
sessions where they were given the following
instructions:

We are interested in investigating the possibility 
of ESP (telepathy). I want you to look into this 
machine [Gerbrands’ tachistoscope]. Do not re- 
move your eyes from the machine until I tell you 
to do so. In the machine you see a card with two 
circles on it. The left one is marked L and the 
right one is marked R. We want to find out if 
you can guess which of the two circles is correct 
on a particular trial. The correct circle will be so 
designated on the back of the card, but you, of 
course, will not be able to see the designation. 
After I say “ready” you will hear a click. I want 
you to tell me after the click whether you think 
the left or the right circle is correct for that trial.
If you think that the left one is correct, say “left,” 
and if you think that the right one is correct, say 
“right.” 


Half of the Ss were then told: 


I will tell you if your choice is correct or in- 
correct. There will be ten trials. Between trials 
there will be a brief pause while a new card is in- 
serted. Are there any questions? 


With each click the words “choose left” or “choose 
right” made up of block letters 7 of an inch high 
were flashed in the center of the screen. Whether 
“choose left” or “choose right” was flashed was predetermined
by a Gellerman (1933) order. The correct
circle on any trial was, of course, the one flashed
in the message. At the conclusion of the experimental
session, each S was asked for a verbal report.

A three by two factorial design was used. The Ss 
were assigned randomly to the conditions such that 
an equal number of Ss were in each condition. There 
were three exposure speeds, .01 second, .02 second, 
and .03 second, and half the Ss at each speed were 
told when they were correct (hereafter referred to as 
the TWC Ss) while the other half (NTWC) were
given no knowledge of the correctness of their choices.
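
The balanced random assignment described in this paragraph can be sketched in code. The Python sketch below is an illustration only: the function names, the seed argument, and the approach of shuffling a list of condition slots are assumptions, since the article states only that assignment was random with an equal number of Ss in each condition.

```python
import random
from collections import Counter
from itertools import product


def assign_balanced(subject_ids, speeds=(0.01, 0.02, 0.03),
                    feedback=("TWC", "NTWC"), seed=None):
    """Randomly assign subjects to a speeds x feedback design with equal cells.

    Hypothetical helper: builds one slot per subject, shuffles the slots,
    and pairs them with the subjects in order.
    """
    cells = list(product(speeds, feedback))          # 3 x 2 = 6 conditions
    if len(subject_ids) % len(cells) != 0:
        raise ValueError("number of subjects must be a multiple of the cell count")
    per_cell = len(subject_ids) // len(cells)
    slots = [cell for cell in cells for _ in range(per_cell)]
    rng = random.Random(seed)                        # seeded for reproducibility
    rng.shuffle(slots)
    return dict(zip(subject_ids, slots))


def cell_sizes(assignment):
    """Count how many subjects landed in each condition."""
    return Counter(assignment.values())


assignment = assign_balanced(list(range(1, 61)), seed=1959)
sizes = cell_sizes(assignment)
```

With 60 Ss and six conditions, every entry in `sizes` is 10, which corresponds to the equal cell sizes the design calls for.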


Results and Discussion 


The mean numbers of correct choices for 
each group are presented in Table 1. 

An analysis of variance was conducted and 
the over-all F was not significant. Other 
analyses indicated that none of the groups 
differed significantly from chance which, of 
course, was five correct choices. 

Table 1

Mean Number of Correct Choices

Group      .01     .02
TWC        4.9     5.0
NTWC       4.1     5.0


Although none of the groups exceeded
chance expectations, four Ss made nine or
more correct choices. Making nine correct 
choices out of 10 is significant at about the 
1% point if the S had been selected before- 
hand. All four Ss indicated during their ver- 
bal report that they had been able to read the 
words. One of the four high-scoring Ss was 
in the TWC .02 group, one in the TWC .03 
group, one in the NTWC .02 group, and one 
in the NTWC .03 group. 
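
The "about the 1% point" figure, and the qualification that it holds only for an S selected beforehand, can be checked with the binomial distribution. The following Python sketch assumes independent trials with a chance probability of one half on each; the function and variable names are illustrative and do not come from the article.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Chance probability that one preselected S gets 9 or 10 of 10 trials correct.
p_single = binomial_upper_tail(9, 10)            # 11/1024, about 0.011

# With 60 Ss and no preselection, the expected number of such scores by chance.
expected_by_chance = 60 * p_single

# Probability that at least one of 60 guessing Ss reaches 9 or more correct.
p_at_least_one = 1 - (1 - p_single) ** 60
```

Under these assumptions a single high score among 60 Ss would be unremarkable, with `p_at_least_one` close to one half. This is why the significance level applies only to an S chosen in advance.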

In addition to the four mentioned above, 
six other Ss reported that they could read the 
words although usually not until the later 
trials. Their scores were 5, 6, 7, 7, 8, and 8. 
The rest of the Ss did not report seeing any- 
thing except the circles. 

Our results thus indicate that under the 
conditions of the present experiment no evi- 
dence for subliminal perception was found. 
This obviously does not mean that subliminal 
perception cannot occur under other conditions,
and since the commercial organization
refuses to release the details of its experimental
procedure (McConnell, 1958), no direct
comparison between their findings and
ours is possible. Nevertheless, it seems rea- 
sonable to assume that the striking findings 
claimed by the commercial organization at an 
exposure speed of 1/3000 of a second are due 
to some artifact and are not a genuine in- 
stance of subliminal perception. 


Summary 

Sixty female undergraduates served as Ss 
in a study designed to investigate subliminal 
perception. Speed of stimulus presentation 
and knowledge of results were varied in a 
three by two factorial design. No evidence of 
subliminal perception was obtained. Impli- 
cations of these findings were discussed. 


Received September 3, 1958. 


References 


Gellerman, L. W. Chance orders of alternating
stimuli in visual discrimination experiments. J.
genet. Psychol., 1933, 42, 207-208.

McConnell, J. V. Subliminal stimulation: An appraisal
of recent developments. Paper read at
APA, Washington, D. C., September, 1958.

McConnell, J. V., Cutler, R. L., & McNeil, E. B.
Subliminal stimulation: An overview. Amer. Psychologist,
1958, 13, 229-242.





Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


EVALUATION OF TRAINING IN CREATIVE
PROBLEM SOLVING ¹


ARNOLD MEADOW AND SIDNEY J. PARNES


University of Buffalo 


A training method widely employed in in- 
dustry, government, and education is the 
creative problem-solving method outlined by 
Osborn (1957). The present study was de- 
signed to provide a systematic experimental 
test of the effects of a 30-hour training course 
in creative problem solving which utilizes Os- 
born’s brainstorming and related methods. 

Examination of the literature in the area of 
creative thinking indicates four groups of rele- 
vant studies. A first series comprises studies 
attempting to differentiate creative from non- 
creative individuals by means of tests of cog- 
nitive functioning, by personality measures, 
and by biographical data analysis (Creative 
Education Foundation, 1958). 

A second series attempts to determine the 
effects of various factors postulated to inhibit 
productive thinking. Among these are studies
evaluating the effects of pathological personality
syndromes, experimentally induced anxiety,
and experimentally induced set (Rapaport,
Gill, & Schafer, 1945-46; Youtz, 1955).

Studies comparing individual and group 
problem solving procedures (Lindzey, 1954; 
Taylor, Berry, & Block, 1957; Taylor & Mc- 
Nemar, 1955) and studies evaluating a lec- 
ture and a workshop in creative thinking 
(Gerry, DeVeau, & Chorness, 1957; True, 
1957) comprise the third and fourth bodies 
of literature. 


Hypotheses 


In the course at the University of Buffalo 
the procedures in Osborn’s textbook (1957) 
are described, and students are given practice 
in their application (Parnes, 1958). The brain- 
storming principle is emphasized throughout 
the course. The basic thesis of this principle 
is that creativity is encouraged by the temporal
segregation of hypothesis formation and
the judicial evaluation of the adequacy of hypotheses.

¹ This study was financed by a grant from the
Creative Education Foundation. The IBM Corporation
provided the programming and computations required
by the statistical analysis.

In the attempt to evaluate the effects of 
the course in creative problem solving, three 
hypotheses were proposed for experimental 
testing: the method employed in the course 
produces a significant increment (a) in 
quantity of ideas, (b) in quality of ideas,
and (c) in three personality variables— 
need achievement, dominance, and self-con- 
trol. The variables embodied in these hy- 
potheses were selected on the basis of a search 
of the literature for measures reported to dis- 
criminate creative from noncreative individu- 
als (Creative Education Foundation, 1958). 


Method 
Experimental Design 


The three hypotheses were tested by administering
a battery of psychological tests comprised of 11
measures to students taking the Creative Problem
Solving courses in the School of Business Administration
and to control groups of Ss taking other
courses in the same school. The basic design of the
experiment is depicted in Table 1.

The experimental group consisted of a total of 54
students in three Creative Problem Solving courses.
Two were evening sections; the other was a day section.
Since total pre-post testing time required four
hours, it was not practicable to administer all tests
to one control group. Two control groups were accordingly
employed. Those measures of the battery
which were considered to be tests of ability were administered
to Control Group A. Control Group B
received those tests which were considered to be personality
measures. The one exception to this procedure
was the Thematic Apperception Test (TAT)
Originality ability measure, which was included in
the Control Group B battery because the total number
of ability tests was too great to be administered
during one testing period.

Table 1

Design of Experiment

                                                            Experimental   Control    Control
                                                               Group       Group A    Group B
Pre-Post Test Measures                                        (N = 54)     (N = 54)   (N = 54)

 1. AC Test of Creative Ability: Other Uses (quantity)           X            X
 2. Plot Titles Low (quantity)                                   X            X
 3. Guilford Unusual Uses (quality)                              X            X
 4. Apparatus Test (quality)                                     X            X
 5. AC Test of Creative Ability: Other Uses (quality)            X            X
 6. Plot Titles High (quality)                                   X            X
 7. Thematic Apperception Test: Originality (quality)            X                       X
 8. Thematic Apperception Test: Need Achievement                 X                       X
 9. California Psychological Inventory: Dominance Scale          X                       X
10. California Psychological Inventory: Self Control Scale       X                       X
11. Wechsler Adult Intelligence Scale: Vocabulary ᵃ              X            X          X

ᵃ Pretest only, for matching of experimental and control groups.


Each experimental S was matched with an S from
each of the two control groups on the basis of age,
sex, and Wechsler Adult Intelligence Scale (WAIS)
Vocabulary score (Wechsler, 1955). In order to increase
the accuracy of matching, the initial number
of control Ss tested was 200. Completion of the
matching yielded a total of 54 Ss for the experimental
group and 54 Ss for each of the two control
groups. The experimental and control groups were
closely matched on the selected variables. Ages for
the experimental group ranged from 17 to 51 years,
for Control Group A, 17 to 50 years, and for Control
Group B, 18 to 42 years. For the experimental and
control A groups the average of the differences in
age for the 54 matched pairs was 3.6 years; the average
of the differences in weighted WAIS Vocabulary
score was .60. For the experimental and control
B groups the average of the differences in age of the
54 matched pairs was 3.8 years, and the average of
differences in WAIS Vocabulary score was .68.

Of the final experimental group sample, 42 were
male; 12 were female. The final Control Group A
and B samples each consisted of 48 male and 6 female
Ss. Of the 54 Experimental vs. Control Group
A matchings, 38 were of the same sex. The corresponding
number of same-sex matchings for Experimental
and Control Group B was 40.

Tests were administered to all Ss as groups in their
regular classes at the beginning and end of the semester.
Three class sections were used to obtain the
experimental Ss; ten sections were needed to attain
the necessary number of control Ss.


Experimental Instructions


Each instructor introduced the experiment to his
class by describing it as a university research project
which would “not have anything to do with your
grades.” The test administrator was then presented
to the class.


Instructions given at pretest session at beginning
of semester. I think you will find interesting what
you are asked to do. Sometimes the nature of
the task may seem strange or silly. Nevertheless,
please cooperate to the fullest extent inasmuch as
everything you are asked to do is highly significant.


Instructions read at posttest session at the end 
of semester. All of you are subjects in an experi- 
ment designed to measure changes which may have 
occurred in your thinking as a result of all your 
course work at the University this semester. 

During this period you will be given the post- 
test, consisting of a series of tests similar to the 
ones given the first time. 

Your instructor, Mr. , is interested in see- 
ing how well each one of you does. On the other 
hand, as explained before, the results of these tests 
will not go on your record, or have anything to 
do with your grades. It is a serious study which 
will provide some interesting scientific data, and 
we would appreciate your sincere cooperation. 

In the tests you will now take, you may use 
any answers which you may have used before 
and/or any new answers. The important point is 
to get as high a score as possible on the present 
test. 


Scoring


All measures were scored by two independent
raters. Protocols were coded so that no rater was
aware of whether he was rating the protocol of a
control or an experimental subject.

Pearson correlation coefficients between the scores 
of these raters were computed for all measures which 
required qualitative ratings. Computations were
based on a randomly selected sample of 50 Ss. 
Correlations ranged from .691 on the TAT Need 
Achievement to .993 on Guilford’s Unusual Uses. 
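
The inter-rater agreement figures above are ordinary product-moment correlations between the two raters' scores. As a minimal sketch, assuming each rater's scores are held in a list in the same subject order, the coefficient can be computed as follows. The function name and the sample scores are hypothetical, not data from the study.

```python
from math import sqrt


def pearson_r(scores_a, scores_b):
    """Pearson product-moment correlation between two raters' scores."""
    n = len(scores_a)
    if n != len(scores_b) or n < 2:
        raise ValueError("need two equally long score lists with at least 2 entries")
    mean_a = sum(scores_a) / n
    mean_b = sum(scores_b) / n
    cross = sum((a - mean_a) * (b - mean_b) for a, b in zip(scores_a, scores_b))
    var_a = sum((a - mean_a) ** 2 for a in scores_a)
    var_b = sum((b - mean_b) ** 2 for b in scores_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when a rater gives constant scores")
    return cross / sqrt(var_a * var_b)


# Hypothetical ratings of four protocols by two raters.
rater_1 = [1, 2, 3, 4]
rater_2 = [2, 1, 4, 3]
agreement = pearson_r(rater_1, rater_2)
```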

Guilford measures. The Guilford measures were
scored in accordance with standard scoring instructions
provided by the author of the tests.²


² The authors are indebted to J. P. Guilford and
P. R. Merrifield for their assistance in providing the
unpublished tests and scoring instructions, and to
Robert F. Berner for statistical advice.


AC Test of Creative Ability (AC). Only one item 
from Part V of the AC Test was employed because 
of time limitation (listing all possible uses for a wire 
coat hanger). The scoring procedure for this test 
was modified to yield a quantity and quality score 
instead of a quantity and uniqueness score. Each 
response was scored as indicating either good or bad 
quality. The quality score was defined as compris- 
ing two dimensions: (a) uniqueness—degree to which 
the response departed from the hanger’s conventional 
use and (b) value—the degree to which the response 
was judged to have social, economic, aesthetic, or 
other usefulness.

The scorer was instructed to rate each response on
a 1-2-3 scale for uniqueness and a 1-2-3 scale for
value. The response was finally scored as indicating
good quality if assigned a combined uniqueness and
value score of at least 5. Final quality score used
was the total number of “good quality” responses.
Any response which duplicated (in essential mean- 
ing) responses already given was eliminated from 
the scoring. 
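
The quality-scoring rule just described can be written out as a short procedure. The Python sketch below is illustrative only. The function name and sample responses are hypothetical, and duplicates are detected by simple text matching, which is a stand-in for the scorer's judgment of "essential meaning."

```python
def ac_quality_score(rated_responses):
    """Count "good quality" responses for one S.

    rated_responses: list of (response_text, uniqueness, value) tuples,
    with uniqueness and value each rated 1, 2, or 3.
    """
    seen = set()
    good = 0
    for text, uniqueness, value in rated_responses:
        if uniqueness not in (1, 2, 3) or value not in (1, 2, 3):
            raise ValueError("ratings must be 1, 2, or 3")
        key = text.strip().lower()
        if key in seen:              # duplicate responses are not scored
            continue
        seen.add(key)
        if uniqueness + value >= 5:  # combined score of at least 5
            good += 1
    return good


# Hypothetical uses for a wire coat hanger with made-up ratings.
sample = [
    ("hang a coat", 1, 2),
    ("radio antenna", 3, 2),
    ("Radio antenna ", 3, 3),        # treated as a duplicate
    ("wire sculpture", 3, 3),
]
quality = ac_quality_score(sample)
```

Because each scale tops out at 3, a response reaches the cutoff only with ratings of 2 and 3 or 3 and 3. A response rated at the bottom of either scale can never count as good quality.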

Need achievement. This modification of the TAT
test was scored according to directions published by
McClelland, Atkinson, Clark, and Lowell (1953, pp.
107-138). The Originality measure was derived from
story protocols obtained from the same four TAT-type
cards utilized for deriving the Need Achievement
measure. Previous studies employing the Originality
measure were based on a global appraisal by
scorers (Barron, 1955). In the present investigation
an attempt was made to introduce greater objectivity
in scoring by adopting a detailed rating method. A
four-step rating scale was utilized to define the Originality
dimension on each of an S’s four stories: (a)
description or bare story, one point; (b) story with
some elaboration of characters and/or plot, two
points; (c) elaborate story, three points; (d) story
indicating unusual amount of imaginative elaboration,
four points. An S’s total originality score was
the sum of the points for all four stories.

California Psychological Inventory. The CPI
Dominance and Self Control Scales were scored according
to standard instructions provided by Gough ³
(1957).


Sequence of Tests 


In designing the experiment, cognizance was taken 
of the effect the sequence of the tests might have on 
results. Test sequence was identical for the experimental
group and Control Group A. The comparison
of the experimental group with Control Group
B, however, introduces an uncontrolled test sequence
variable. On the one hand, the experimental group
had taken the series of six ability tests prior to the
administration of the three personality measures and
the TAT Originality measure. On the other hand,
Control Group B was administered the personality
measures and the TAT Originality measure without
prior administration of the series of ability tests.

³ We wish to thank Harrison Gough for providing
the individual item keys for the two scales. We also
wish to express acknowledgment to the Consulting
Psychologists Press, Inc., Palo Alto, California, for
permission to use the scales.

The primary experimental interest was in testing 
the effects of the creative problem solving course on 
abilities. The decision was therefore made to place 
six of the seven ability tests before the personality 
tests, thus leaving the comparison of the ability tests 
of the experimental group with Control Group A un- 
contaminated by the test sequence effect. A priori 
considerations suggested, moreover, that the ability 
tests were less likely to influence personality measures
than the converse arrangement.


Results 


In order to control for possible differences 
in initial levels of performance, an analysis of 
covariance was employed for the evaluation of 
differences between experimental and control 
groups on all measures. Inspection of the 
data indicated that the regression was suffi- 
ciently linear to meet the assumptions of the 
covariance model. 

The calculation procedure employed is that 
described by Edwards (1951, pp. 341-348) 
for a two-variable analysis of covariance de- 
sign. 
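
For readers unfamiliar with the method, a one-covariate analysis of covariance based on errors of estimate can be sketched as follows. This is a generic textbook formulation written for illustration and is not taken from Edwards' text. The function names, the data layout, and the sample scores are assumptions. The degrees of freedom follow the usual independent-groups convention, which may not be the one used in this study.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Group = Tuple[Sequence[float], Sequence[float]]   # (initial scores, final scores)


@dataclass
class AncovaTable:
    ss_between_plus_error: float   # errors of estimate about the total regression
    ss_within_error: float         # errors of estimate about the within-groups regression
    ss_adjusted_means: float       # difference between the two
    df_adjusted_means: int
    df_error: int
    f_ratio: float


def _deviation_sums(x, y):
    """Sums of squares and cross products about the means of x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxx, syy, sxy


def ancova(groups: List[Group]) -> AncovaTable:
    all_x = [v for x, _ in groups for v in x]
    all_y = [v for _, y in groups for v in y]
    sxx_t, syy_t, sxy_t = _deviation_sums(all_x, all_y)
    sxx_w = syy_w = sxy_w = 0.0
    for x, y in groups:                      # pool the within-group sums
        sxx, syy, sxy = _deviation_sums(x, y)
        sxx_w, syy_w, sxy_w = sxx_w + sxx, syy_w + syy, sxy_w + sxy
    ss_total = syy_t - sxy_t ** 2 / sxx_t
    ss_within = syy_w - sxy_w ** 2 / sxx_w
    ss_adjusted = ss_total - ss_within
    df_adjusted = len(groups) - 1
    df_error = len(all_x) - len(groups) - 1  # one df lost to the covariate
    f_ratio = (ss_adjusted / df_adjusted) / (ss_within / df_error)
    return AncovaTable(ss_total, ss_within, ss_adjusted, df_adjusted, df_error, f_ratio)


# Hypothetical scores: both groups start alike, one gains 3 points more.
initial = [1, 2, 3, 4, 5]
control = (initial, [2, 1, 4, 3, 6])
experimental = (initial, [5, 4, 7, 6, 9])
table = ancova([control, experimental])
```

The three sums of squares correspond to the rows labeled "Between groups plus error," "Residual within groups (error)," and "Adjusted Means" in the tables of results.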

Table 2 presents the comparison between 
the adjusted mean variances of experimental 
and control groups for the two measures of 
quantity of ideas. Inspection of the F ratios 
indicates both measures are significant beyond 
the 1% level. 

A similar comparison is depicted in Table 3 
for the five measures of quality of ideas. The 
results indicate that the AC Other Uses (qual- 
ity), and the Guilford Apparatus and Unusual 
Uses scores are significant beyond the 1% 
level. The Plot Titles High score just fails to 
reach the 5% level of significance. (Obtained 
F is 4.01; 4.02 is required for the 5% level.) 
The TAT Originality measure does not yield 
a significant difference. 

The comparison between experimental and 
control groups for the three personality measures
is presented in Table 4. The results indicate
that the experimental as compared with
the control group achieves a significant in- 
crease in Dominance. This comparison is sig- 
nificant at the 5% level. The results for the 
Need Achievement and Self Control variables 
indicate no significant differences. 
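
Each F ratio in Tables 2 to 4 is the mean square for adjusted means divided by the residual within-groups mean square, so the printed values can be checked by simple division. The Python sketch below uses the mean squares as they appear in the tables. The dictionary layout and names are illustrative. An F of `None` marks a ratio that is not legible in this copy, and the values computed for those rows are derived here rather than quoted from the article.

```python
# measure: (adjusted-means mean square, error mean square, printed F or None)
TABLED_VALUES = {
    "AC Other Uses (quantity)":  (888.121,   20.344, 43.655),
    "Guilford Plot Titles Low":  (1150.5302, 62.154, None),
    "AC Other Uses (quality)":   (467.5089,  6.771,  69.046),
    "Guilford Apparatus":        (1136.6958, 28.205, 40.301),
    "Guilford Unusual Uses":     (637.2854,  15.295, 41.666),
    "Guilford Plot Titles High": (20.0514,   4.989,  4.019),
    "TAT Originality":           (3.2903,    2.14,   None),
    "TAT Need Achievement":      (6.6142,    7.729,  None),
    "CPI Dominance":             (49.8994,   9.618,  None),
}


def f_ratio(ms_adjusted: float, ms_error: float) -> float:
    """F ratio for the adjusted means against the within-groups error."""
    return ms_adjusted / ms_error


computed = {name: f_ratio(ms_adj, ms_err)
            for name, (ms_adj, ms_err, _) in TABLED_VALUES.items()}

# Largest disagreement between a printed F and the recomputed one.
max_discrepancy = max(abs(computed[name] - printed)
                      for name, (_, _, printed) in TABLED_VALUES.items()
                      if printed is not None)
```

The recomputed ratios agree with every printed F to the third decimal. The recomputed Dominance ratio exceeds the 4.02 quoted in the text as the 5% requirement, while the Need Achievement and TAT Originality ratios fall well below it. This is consistent with the results as reported.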





Table 2

Analysis of Covariance Between Pre-Post Differences of Matched Experimental and Control Groups
Controlled for Initial Score Level: Quantity Creativity Measures

                                                     Sum of Squares            Mean
Test              Source of Variation                Errors of Est.    df      Square         F

AC Other Uses     Between groups plus error            1946.012
  Quantity        Residual within groups (error)       1057.891                 20.344
                  Adjusted Means                         888.121               888.121     43.655

Guilford Plot     Between groups plus error            4382.5275
  Titles Low      Residual within groups (error)       3231.9973                62.154
                  Adjusted Means                       1150.5302              1150.5302


Discussion

The comparison between the Experimental
and Control Group A indicated significant differences
on both quantitative and qualitative
measures of ability. On the two measures of
idea quantity the experimental group attained
a greater increase than the control group.
This result suggests the conclusion that the
creative problem solving students were utilizing
the course methods, even though the tests
gave no instructions to do so.

Table 3

Analysis of Covariance Between Pre-Post Differences of Matched Experimental and Control Groups
Controlled for Initial Score Level: Quality Creativity Measures

                                                     Sum of Squares            Mean
Test              Source of Variation                Errors of Est.    df      Square         F

AC Other Uses     Between groups plus error             819.5780
  Quality         Residual within groups (error)        352.0691                 6.771
                  Adjusted Means                        467.5089               467.5089    69.046

Guilford          Between groups plus error            2603.3446
  Apparatus       Residual within groups (error)       1466.6488                28.205
                  Adjusted Means                       1136.6958              1136.6958    40.301

Guilford          Between groups plus error            1432.6361
  Unusual Uses    Residual within groups (error)        795.3507                15.295
                  Adjusted Means                        637.2854               637.2854    41.666

Guilford Plot     Between groups plus error             279.4798
  Titles High     Residual within groups (error)        259.4284                 4.989
                  Adjusted Means                         20.0514                20.0514     4.019

TAT               Between groups plus error             114.5780
  Originality     Residual within groups (error)        111.2877                 2.14
                  Adjusted Means                          3.2903                 3.2903


Three of the quality measures (the AC
Other Uses Quality, and the Guilford Apparatus
and Unusual Uses tests) yielded highly
significant differences. In evaluating the results
indicated by the AC Other Uses and
Guilford Unusual Uses scores, the specific nature
of the training employed in the course
must be considered. The students did receive
practice on the type of problem included on
these tests. However, since the instructors
carefully avoided practice on any objects even
remotely similar to the type of objects which
appeared on the tests, the results do indicate
generalization of this training. Results of the
Apparatus Test probably represent a greater
degree of learning generalization inasmuch as
problems designed to afford students practice
in thinking of improvements for apparatus
were deliberately excluded from training.


Table 4

Analysis of Covariance Between Pre-Post Differences of Matched Experimental and Control Groups
Controlled for Initial Score Level: Personality Measures

                                                     Sum of Squares            Mean
Test              Source of Variation                Errors of Est.    df      Square         F

TAT Need          Between groups plus error             408.5023
  Achievement     Residual within groups (error)        401.8881                 7.729
                  Adjusted Means                          6.6142                 6.6142

CPI               Between groups plus error             550.0154
  Dominance       Residual within groups (error)        500.1160                 9.618
                  Adjusted Means                         49.8994                49.8994

CPI               Between groups plus error            1160.8034
  Self Control    Residual within groups (error)       1148.0450
                  Adjusted Means                         12.7584


Of the three personality measures, the CPI
Dominance scale was the one measure which
yielded a significant difference. This result
indicated an increase in Dominance of the
Experimental as compared with Control Group
B (P < .05). This scale was devised by
Gough “to assess factors of leadership ability,
dominance, persistence, and social initiative.
. . . High scorers tend to be seen as: Aggressive,
confident, persistent, and planful; as being
persuasive and verbally fluent; as self-reliant
and independent; and as having leadership
potential and initiative. Low scorers
tend to be seen as: Retiring, inhibited, commonplace,
indifferent, silent and unassuming;
as being slow in thought and action; as avoiding
of situations of tension and decision; and
as lacking in self-confidence” (Gough, 1957,
p. 12).

It is interesting that Dominance was the
one variable out of the three personality variables
which yielded a positive result. The
personality type it represents is the very type
which the methods of the course were explicitly
designed to encourage.


Summary 

The experiment was designed to evaluate 
the effects of a creative problem-solving course 
on creative abilities and selected personality 
variables. Three hypotheses were tested: the 
method employed in the course would produce 
a significant increment (a) in quantity of 
ideas, (b) in quality of ideas, and (c) in
the three personality variables—need achieve- 
ment, dominance, and self-control. 

A battery of 10 test measures was adminis- 
tered to matched experimental and control 
groups at the beginning and end of a creative 
problem solving course. The following re- 
sults were obtained: (a) The experimental as 
compared with the control group attained sig- 
nificant increments on the two measures of 
quantity of ideas; (b) the experimental as
compared with the control group attained sig- 
nificant increments on three out of five meas- 
ures of quality of ideas; (c) the experimental 
as compared with the control group showed 
a significant increment on the California Psy- 
chological Inventory Dominance scale. 

Results are interpreted to indicate that the
creative problem-solving course produces a
significant increment on certain ability measures
associated with practical creativity and
on the personality variable dominance.


Received September 3, 1958. 


References 


Barron, F. The disposition towards originality. J.
abnorm. soc. Psychol., 1955, 51, 478-485.

Creative Education Foundation. Compendium of
research on creative imagination. Buffalo, N. Y.:
Author, 1958.

Edwards, A. L. Experimental design in psychological
research. New York: Rinehart, 1951.

Gerry, R., DeVeau, L., & Chorness, M. A review of
some recent research in the field of creativity and
the examination of an experimental creativity
workshop. Training Analysis and Development
Div., Lackland AFB, Texas, 1957.

Gough, H. G. Manual for the California Psychological
Inventory. Palo Alto, Calif.: Consulting
Psychologists Press, 1957.

Lindzey, G. (Ed.) Handbook of social psychology.
Cambridge, Mass.: Addison-Wesley, 1954.

McClelland, D. C., Atkinson, J. W., Clark, R. A.,
& Lowell, E. L. The achievement motive. New
York: Appleton-Century-Crofts, 1953.

Osborn, A. F. Applied imagination. New York:
Scribner’s, 1957.

Parnes, S. J. Description of the University of
Buffalo Creative Problem Solving Course. Creative
Education Office, Univer. of Buffalo, 1958.
(Mimeo.)

Rapaport, D., Gill, M., & Schafer, R. Diagnostic
psychological testing. Chicago: Chicago Yearbook
Publishers, 1945-1946. 2 vols.

Taylor, D. W., Berry, P. C., & Block, C. H. Does
group participation when using brainstorming
facilitate or inhibit creative thinking? Dep. of
Industrial Administration and Dep. of Psychol., Yale
Univer., 1957. (Tech. Rep. No. 1, Contract Nonr
609(20) NR 150-166.)

Taylor, D. W., & McNemar, Olga W. Problem solving
and thinking. Annu. Rev. Psychol., 1955, 6,
455-482.

True, G. H. Creativity as a function of idea fluency,
practicability, and specific training. Dissertation
Abstr., 1957, 17, 401-402.

Wechsler, D. Manual for the Wechsler Adult
Intelligence Scale. New York: Psychological Corp.,
1955.

Youtz, R. P. Psychological background of principles
and procedures in Alex F. Osborn’s textbook
entitled Applied Imagination. Buffalo: Creative
Educ. Found., 1955. (Mimeo.)


Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


INCREASING PROBABILITY OF TARGET DETECTION
WITH A MIRROR-IMAGE DISPLAY¹


C. H. BAKER AND G. E. BOYES


Defence Research Medical Laboratories, Toronto, Canada 


In an earlier investigation (Baker, 1958) 
an analysis of the locations of targets de- 
tected in a radar-like task supported the hy- 
pothesis that Ss tend to scan back and forth 
behind the revolving radial sweep-line when 
searching for targets. Such a scanning tech- 
nic inevitably results in visual coverage of 
the extreme ends of the line (corresponding 
to minimum and maximum range on a PPI) 
which is half that devoted to mid-portions of 
the line, and, indeed, the analysis indicated 
that twice as many targets were detected be- 
hind the mid-portions of the revolving sweep- 
line as were detected in the regions behind 
the extremes. By redesigning the display in 
such a manner as to encourage a different 
method of search it was shown that visual at- 
tention could be biased towards one extreme 
of the sweep-line so as to increase the num- 
ber of peripheral detections. 
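
The factor of two in coverage follows from the geometry of a back-and-forth scan, and a toy model makes it concrete. The sketch below is an illustration only, not the analysis reported in Baker (1958). It assumes that the sweep-line is divided into discrete fixation positions (seven here, an arbitrary choice) and that gaze steps one position at a time, reversing at each end. The name `scan_coverage` is hypothetical.

```python
from collections import Counter


def scan_coverage(n_positions: int, n_cycles: int) -> Counter:
    """Count fixations per position for a back-and-forth scan along a line."""
    # One cycle runs from position 0 up to the far end and back down to 1,
    # so each turning point is fixated once per cycle and every interior
    # position is fixated twice (once in each direction).
    one_cycle = list(range(n_positions)) + list(range(n_positions - 2, 0, -1))
    counts = Counter()
    for _ in range(n_cycles):
        counts.update(one_cycle)
    return counts


coverage = scan_coverage(n_positions=7, n_cycles=100)
for position in range(7):
    print(position, coverage[position])
```

Under these assumptions the two end positions receive half as many fixations as each interior position, which is the pattern the hypothesis predicts for detections.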

The present study was also concerned with 
increasing the probability of detection of 
targets appearing near locations representing 
maximum range, but with this difference: 
here we were concerned with maximizing the 
probability of detection of such targets by 
designing a radar-like display on which maximum
range was represented by the center of the
sweep-line.²


Apparatus 


The basic apparatus simulated the type of 
radar display known as a B-scan, i.e., a 12- 
inch square display of ground glass with a 
vertical sweep-line which moved from left to 
right across the front of the display six times 
per minute, each sweep taking 10 seconds. 
Single targets³ (bright spots of light one mm.
in diameter) could be “painted on” the display
at any one of 49 locations by the sweep-line
and left on for any desired period of time
(see Fig. 1). It will be noted from Fig. 1
that, as for a conventional B-scan, range and
azimuth are represented by the vertical and
horizontal dimensions, respectively, with maximum
range represented by the top of the display
and minimum by the bottom.

The apparatus could be turned on its side
to present the display shown in Fig. 2. Range
was now represented by the horizontal dimension,
with minimum range represented by the
left of the display, and maximum by the
right. Such an arrangement encourages a
lateral scanning motion. On the basis of conclusions
drawn in tracking studies (Fitts &
Simon, 1952) one would anticipate superior
scanning behavior in the lateral to that in the
conventional vertical direction.

A unique characteristic of the apparatus
was that it could be “opened up” like a book
so that the right half of the display was a
lateral reversal of the left half (see Fig. 3).
Every target was now “painted” twice and
maximum range was represented mid-way
across the display. This arrangement has
been termed the “mirror-image” display.


Fig. 1. Conventional display. [Range is vertical, with minimum at the bottom and maximum at the top; azimuth is horizontal; the sweep-line is vertical.]

Fig. 2. Horizontal display. [Range is horizontal, with minimum at the left and maximum at the right; azimuth is vertical.]


1 This study constitutes Defence Research Medical
Laboratories Report No. 107-6; PCC No. D77-94-20-23;
HR No. 163.

2 Note that target detectability is not improved by
inverting the range dimension on a PPI so that display
center represents maximum range; see Hickson
and Scott (1958).

3 Targets were visually matched in brightness but
it was evident in some of the trials that matches were
not perfect.

Minimum range * could be represented by 
any vertical line on the display. If minimum 
were represented by the lateral boundaries 
AA shown in Fig. 3, the left half of the dis- 
play would be a duplicate of that in Fig. 2, 
while the right half would constitute a mirror- 
image. Under these conditions the display 
would be twice the area of that shown in 
Figs. 1 and 2. On the other hand, the mirror- 
image display could be equated in area with 
that in Figs. 1 and 2 by reducing the hori- 
zontal display dimension to one half its con- 
ventional value. Minimum range would now 
be represented by the dotted lines BB. Un- 
der this arrangement a target displayed at, 
say, % maximum range, would be physically 
located % of the distance from the dotted
lines to maximum, and so would not be at the
same physical location on the basic display as
in the case when minimum AA was employed.


Fig. 3. Horizontal mirror-image display. [Range is horizontal, with minimum at both lateral edges and maximum at the center; azimuth is vertical.]


*“Minimum” and “maximum” are not to be taken
literally. Targets appearing at these represented
ranges were from ¢ to 4 inch from display boundaries.
“Half” range means half the physical distance
between the “minimum” and “maximum” boundaries.


The mirror-image display could be placed 
on end, as shown in Fig. 4. Figures 1 to 4 
represent the four experimental conditions 
which were compared. 
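
The relation between represented range and physical position under the two mirror-image arrangements can be sketched as follows. This is an idealized illustration, not part of the apparatus description: it takes the 12-inch display dimension from the text, measures position in inches from the left edge of the opened apparatus, and ignores the fact (noted in the footnote) that targets at “minimum” and “maximum” lay slightly inside the boundaries. The function names and the 0-to-1 range scale are hypothetical.

```python
def horizontal_x(r: float, width: float = 12.0) -> float:
    """Horizontal display (Fig. 2): minimum range at the left, maximum at the right."""
    return r * width


def mirror_image_x(r: float, width: float = 12.0, equal_area: bool = False):
    """Positions of the two images of one target on the opened apparatus.

    r is represented range as a fraction (0 = minimum, 1 = maximum).
    The opened apparatus is 2 * width wide with maximum range at its centre.
    With equal_area=False minimum range lies at the outer boundaries (AA);
    with equal_area=True the end quarters are masked and minimum range
    lies at the dotted lines (BB), halving the span given to range.
    """
    centre = width
    span = width / 2 if equal_area else width
    offset = (1.0 - r) * span
    return (centre - offset, centre + offset)


print(mirror_image_x(0.5))                   # double-area arrangement (AA)
print(mirror_image_x(0.5, equal_area=True))  # equal-area arrangement (BB)
```

In this idealization a target at half range is painted 6 in. either side of center when minimum is AA, but only 3 in. either side when minimum is BB, which is why the same represented range does not fall at the same physical location in the two arrangements.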


Experiment 1 (Double Area Mirror-Image) 
Procedure 

Subjects sat alone in a semi-darkened booth facing 
the display which was tilted 40 deg. back from the 
horizontal. Viewing distance was about 16 in. The 
task was to press a button whenever a target was 
detected. They were told that targets might be 
“painted on” anywhere on the display by the sweep- 
line, and sample targets were shown. Targets, which 
persisted for one second, appeared at only 13 of the 
49 locations available, three at each of minimum, 
half, and maximum range,* plus one randomly chosen 
from the remaining locations in each quarter. The 
same 13 locations were used in all four conditions, 
though, of course, when the sweep-line was vertical
they represented different ranges and azimuths than
when it was horizontal. Each target in any trial
was exposed eight times, and 8 × 13, or 104 targets
were presented in a different random order for each
S and condition. Intervals between targets randomly
ranged between 12 and 38 sec., the mean intertarget
interval being 20 sec. A trial lasted 35 min. In this
experiment minimum range was represented by AA
(Figs. 3 and 4). Thus, the mirror-image displays
were twice the area of the other two.

Subjects were six laboratory personnel. Each S
was exposed to each of the four conditions in a different
order.


Fig. 4. Vertical mirror-image display. [Range is vertical, with minimum at the top and bottom edges; azimuth is horizontal.]


Results


In Table 1 is shown the number of targets
which escaped detection under each display
condition at the nine locations representing
minimum, half, and maximum range. From
Table 1 it is apparent that in three of the
four conditions more targets were detected in
locations representing half range than in those
representing minimum or maximum range.
The exception was the horizontal mirror-image
condition in which progressively fewer
targets escaped detection as represented range
increased.

The 48 misses in the horizontal mirror-image
condition at maximum range were almost
completely due to one target which was
missed 44 times, the other two at this range
being missed four and zero times. This same
target is responsible for an enlarged number
of misses in the horizontal condition, the total
of 67 being composed of 15, 9, and 43. In
location this target was at bottom center in
the former condition and bottom right in the
latter. Under the other two conditions this
target was differently located and was not
missed a disproportionate number of times,
indicating a marked positional factor.


Table 1

Showing the Number of Targets Missed Out of 144 Presented, at Each of the Three Represented
Ranges for Each of Four Display Conditions

               Conventional   Horizontal   Horizontal     Vertical
Range          Display        Display      Mirror-Image   Mirror-Image
Represented    (Fig. 1)       (Fig. 2)     (Fig. 3)       (Fig. 4)

Minimum            106            84           106            112
Half                39            34            55             59
Maximum             91            67            48             83

Total              236           185           209            254

Note. Mirror-image displays were double the area of the other two.


The percentage of times each of the 13 
targets was missed was subjected to an arc sin 
transformation and an analysis of variance 
was done. The analysis indicated significant 
differences between displays. However, dif- 
ferences between target locations and subject 
interactions implied some subject inconsist- 
ency possibly due to the inadequacy of bright- 
ness matches noted above. 
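
The paper does not say which form of the arc sin transformation was applied. A minimal sketch, assuming the common variance-stabilizing form 2·arcsin(√p) expressed in radians, is given below; the function name is hypothetical. With this form the sampling variance of a transformed proportion based on n presentations is approximately 1/n, whatever the underlying proportion.

```python
import math


def arcsine_transform(misses: int, presented: int) -> float:
    """Return 2 * arcsin(sqrt(p)) in radians for the proportion p = misses / presented."""
    p = misses / presented
    return 2.0 * math.asin(math.sqrt(p))


# One target exposed eight times to one S in one condition, as in Experiment 1.
for misses in (0, 2, 4, 8):
    print(misses, round(arcsine_transform(misses, 8), 3))
```

The transformation stretches the scale near 0 and 100 per cent, so that targets missed almost always or almost never do not have artificially small variances in the analysis.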

In summary it can be stated that this ex- 
periment has demonstrated that the horizontal 
mirror-image display results in an increased 
probability of detection of targets at locations 
representing maximum range. 


Experiment 2 (Mirror-Image Display 
Equated for Area) 


In Experiment 2 the mirror-image displays (Figs.
3 and 4) were equated for area with the remaining
two (Figs. 1 and 2) by masking the two end quarters
of the displays so that minimum range was represented
by BB, i.e., in these displays range was represented
by half the display dimension employed in
Experiment 1. Only 12 targets were employed, three
at each of minimum, 1/3, 2/3, and maximum range.
Each target in any trial was exposed 10 times, the
10 × 12, or 120 targets being randomly presented in
a different order for each S and condition. Targets
persisted for 0.6 second. Subjects were eight females
from outside the laboratory who were paid.


Results


The percentages of target detections were
transformed to radians by the arc sin transformation
and an analysis of variance of the
transformed values is shown in Table 2. From
Table 2 it is apparent that the interaction of
conditions and targets is significant, reflecting
the fact that the locations at which targets
were missed depended on the display.


Table 2

Analysis of Variance of Transformed Data

Source                    df         SS         MS

Subjects (S)               7      18.4198     2.6314
Targets (T)               11      57.3637     5.2149
Conditions (C)             3       4.7609     1.5870
CT                        33      28.3389     0.8588*
ST                        77      17.0507     0.2214
SC                        21       6.8451     0.3260
SCT                      231      34.1891     0.1480

Total                    383     166.9682
Theoretical Residual                          0.1000

* Significant at .01 level.


In Table 3 are shown the number of targets
which escaped detection under each condition
at each of the four locations representing
minimum, 1/3, 2/3, and maximum range.
From Table 3 it is again apparent that under
all conditions the greatest attention was devoted
to locations representing medium range.
With respect to targets at locations representing
maximum range, most were missed under
the conventional condition (95), and fewest
were missed under the horizontal mirror-image
condition (31). Again, the target missed
most frequently (21 out of 31) was at the
bottom center. This same target, at bottom
right, was responsible for 50 of the 87
misses in the horizontal display, and was not missed
a disproportionate number of times on the
other two displays. It is apparent that the
horizontal mirror-image display was superior
to all others at 2/3 range also. In summary it
is clear that the horizontal mirror-image display was
superior on two counts, (a) on the total number
of targets detected, and (b) on the greater
number of targets detected in regions approaching
maximum range.


Table 3

Showing the Number of Targets Missed Out of 240 Presented at Each of the Four Represented Ranges,
for Each of Four Display Conditions. All Displays Were Equal in Area

                    Conventional   Horizontal   Horizontal     Vertical
Range               Display        Display      Mirror-Image   Mirror-Image
Represented         (Fig. 1)       (Fig. 2)     (Fig. 3)       (Fig. 4)

Minimum Range                                       82            114
1/3 Range                                           17             45
2/3 Range                                            2             30
Maximum Range           95             87           31             64

Total Missed                                                      253

Percentage Missed
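
The entries of Table 2 can be checked against the design of Experiment 2. The sketch below is an arithmetic check only, assuming a fully crossed Subjects × Targets × Conditions layout with one transformed value per cell (8 Ss, 12 targets, 4 conditions, as described in the text). The sums of squares are copied from Table 2; the variable names are hypothetical.

```python
n_subjects, n_targets, n_conditions = 8, 12, 4

# Degrees of freedom for main effects and their interactions.
df = {
    "S": n_subjects - 1,
    "T": n_targets - 1,
    "C": n_conditions - 1,
}
df["CT"] = df["C"] * df["T"]
df["ST"] = df["S"] * df["T"]
df["SC"] = df["S"] * df["C"]
df["SCT"] = df["S"] * df["C"] * df["T"]

# Sums of squares as printed in Table 2.
ss = {
    "S": 18.4198, "T": 57.3637, "C": 4.7609, "CT": 28.3389,
    "ST": 17.0507, "SC": 6.8451, "SCT": 34.1891,
}

# Each mean square is the sum of squares divided by its degrees of freedom.
ms = {source: ss[source] / df[source] for source in ss}

for source in ss:
    print(f"{source:4s} df={df[source]:3d}  SS={ss[source]:8.4f}  MS={ms[source]:.4f}")
print("total df =", sum(df.values()), " total SS =", round(sum(ss.values()), 4))
```

Under this assumption the computed degrees of freedom for SC come to 21, agreeing with the value listed for SC in Table 2, and the mean squares and total sum of squares reproduce the printed figures to four decimals.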


Discussion and Conclusion 


The study has demonstrated that displays 
can be designed to capitalize on the fact that 
some portions of displays are given more 
visual coverage than others. By designing a 
display in such a manner that brief events of 
greatest importance occur in the center of the 
area being searched, the probability of such 
events being detected is greater than if they 
occur in relatively peripheral regions. This 
principle appears to hold particularly in situations
where lateral eye movements are involved.
Vertical eye movements, where the
distance scanned is sufficient to require head
movements too, were not found to result in
improvement in the probability of detection
of centrally located events.


Received September 4, 1958. 


References 


Baker, C. H. Attention to visual displays during a 
vigilance task. I. Biasing attention. Brit. J. Psy- 
chol., 1958, 49, 279-288. 

Fitts, P. M., & Simon, C. W. The arrangement of 
instruments, the distance between instruments, and 
the position of instrument pointers as determinants 
of performance in an eye-hand coordination task. 
USAF WADC tech. Rep., 1952, No. 5832. 

Hickson, R. H., & Scott, D. M. Detectability on 
cathode ray tube screens: Comparison of PPI, in- 
verted PPI, and B-scan with noise and noise- 
free conditions. Defence Res. Board of Canada, 
DRML Rep., 1958, No. 163-15. 





Journal of Applied Psychology
Vol. 43, No. 3, 1959


A FACTORIAL STUDY OF DEXTERITY TESTS¹


G. LEE BOURASSA 


Allis Chalmers Manufacturing Company 


AND ROBERT M. GUION 


Bowling Green State University 


The advent of the transistor and other in- 
dustrial operations involving very small items 
brings a need for a better understanding of 
very fine manipulative work. Factor analytic 
research has suggested that hand dexterity 
and finger dexterity may be separate abilities, 
distinguishable in terms of the fineness of the 
work performed (Dvorak, 1947; Fleishman, 
1953; Fleishman & Hemple, 1954a; French, 
1951; Hemple & Fleishman, 1955). 

In developing a test to be used for the se- 
lection of transistor assemblers, the junior au- 
thor made several observations pertinent to 
the present study: (a) the muscle movements 
involved in performance of this job were brief 
and short, total span being measurable in 
very small fractions of inches; (b) even with
parts large enough to hold in the fingers, all 
manipulations had to be made with tweezers 
to avoid contamination of material from finger 
oils; (c) tests of visual skills were significant 
predictors of performance; and (d) there was 
a significant correlation between the tweezer 
dexterity test developed and the depth per- 
ception test of the Bausch and Lomb Ortho- 
rater. 

The present study is built around two hy- 
potheses stemming from these observations: 
(a) there is a factor of psychomotor skill that 
can be called tweezer dexterity, the ability to 
make rapid and controlled manipulations with 
tweezers, that is different from previously 
identified factors of manual or finger dex- 
terities, and (b) the relationship between 
visual factors and psychomotor factors is 
oblique rather than orthogonal. 

Factor analytic literature offers very little
information about either of these hypotheses.
Dexterity tests requiring tweezers have been
included in psychomotor batteries for factor
analysis, but not in large enough numbers to
identify them as defining a distinct factor.
Moreover, visual tests have not been included
in psychomotor batteries. Factor analyses of
visual skills are not common, but the analysis
of Orthorater scores by Zachert (1951) does
suggest that depth perception is distinct from
simple visual acuity.


1 This is a report of research done by Bourassa
under the supervision of Guion in partial fulfillment
of the requirements of the degree of Master of Arts
at Bowling Green State University. The original
thesis from which this article is taken is deposited
in the BGSU library.


Consideration of the literature in this do- 
main suggests one methodological weakness or 
flaw in previous research which should be 
avoided. It appears that previous investiga- 
tors have used the same order of testing for 
all Ss in a given study. Although Fleishman 
and Hemple (1954) have pointed out the ef- 
fect of practice on factor structure of a task, 
and although taking one psychomotor test 
may be considered practice for another, only 
a study by Fleishman (1953) reports any at- 
tempt to vary the administration order of test 
variables, and this was merely a reversal of 
order. 


The Test Battery 


Tests were selected or constructed on the
assumption of five factors: (a) manual dexterity,
previously identified, the ability to
make skillful, controlled arm and hand manipulations
at a rapid rate; (b) finger dexterity,
previously identified, the ability to
make skillful, controlled manipulations with
the fingers at a rapid rate; (c) tweezer dexterity,
hypothesized, the ability to make skillful,
controlled manipulations with tweezers at
a rapid rate; (d) tentatively identified, visual
acuity, the ability to perceive fine visual
stimuli; and (e) tentatively identified, depth
perception, the ability to perceive differences
in distances of stimuli.

The following list identifies the tests used
in terms of the factors each was assumed to
identify. Reference tests used in previous
studies will be identified by an (R) after the
name of the test; tests identified by a (C)
after the name of the test were constructed
specifically for this study.²


Manual Dexterity 


1. Minnesota Rate of Manipulation—Turn- 
ing (R). The score is the number of blocks 
turned in two 35-second trials.³

2. Minnesota Rate of Manipulation—Plac- 
ing (R). The score is the number of blocks 
placed in two 40-second trials. 

3. Dowel Manipulation (C). This test 
consists of a 6” X 18” board with 32 holes 
(four rows of eight) ;% inch in diameter. 
When a 14” length of 4” dowel is inserted into 
a hole, half of it protrudes. Dowels are in 
each hole, and the S removes a dowel with 
one hand and returns it to the hole, reversed, 
with the other hand. The task differs from 
the first test only in the size of the apparatus. 
Since it involves smaller pieces than the Min- 
nesota test, it was assumed that the loading 
for manual dexterity would be lower, although 
still significant since arm movement is in- 
volved. A finger dexterity loading was also 
anticipated. The score is the number of 
dowels inverted in two 25-second trials. 


Finger Dexterity 

4. Purdue Pegboard—Nonpreferred Hand 
(R). The score is the number of pegs placed 
in two 30-second trials. 

5. Purdue Pegboard—Both Hands (R). 
The score is the number of pins placed in 
two 30-second trials. 

6. O’Connor Finger Dexterity (R). The 
score is the number of pins placed in two 2- 
minute trials. 

7. Placing, Finger (C). The test board
contains 32 small washers (four rows of
eight), inside diameters about 5”, into which
the S places small pellets of shot (calibre 118
B-B’s). From a tray at the top of the board,
the S takes a pellet and places it in the hole
of the washer. The score is the number of
pellets placed in two 30-second trials.


2 Complete descriptions and photographs of the
tests developed for this study are included in the
original thesis.

3 Testing times were decided upon on the basis of
pilot runs with varying numbers of Ss. It was desired
to use the smallest time possible consistent with
the need for reasonable reliability, in order to avoid
over-fatigue of subjects.


Tweezer Dexterity 


8. O’Connor Tweezer Dexterity Test. The 
score is the number of pins placed, by 
tweezers, in two 90-second trials. 

9. Pin Moving (C). The test board con- 
tains two round trays, placed six inches apart, 
containing common pins with the heads re- 
moved. With tweezers the S moves pins one 
at a time from a full tray to an empty one.
The score is the number of pins moved in two 
60-second trials. 

10. Bowling Green Tweezer Dexterity Test. 
This test was developed specifically for selec- 
tion of transistor assemblers; it consists of 
placing small plastic discs (diameter ;';”) 
into holes in a brass plate with the help of 
tweezers. There are eight rows of 12 holes 
in the plate; a small bowl at the top of the 
test board contains approximately 200 discs. 
The score is the number of discs placed in 
two 60-second trials. 

11. Placing, Tweezer (C). This test uti- 
lizes the same apparatus as Test 7, except that 
the pellets are to be handled with tweezers 
rather than fingers. The score is the number 
of pellets placed in two 30-second trials. 


Depth Perception 


12. Orthorater Depth Perception (R). The 
score is the number of correct successive re- 
sponses on two trials. 

13. Depth Perception (C). Eight rows of 
four white discs (one inch in diameter) are 
mounted on dowels projecting approximately 
two inches from a black background. In each 
row, one disc extends further (i.e., is closer to 
the S) than the others. This distance differ- 
ence is }” in the first row, and decreases by 
decrements of ;';” in the following rows until 
there is a difference of only 3s” in the eighth 
row. The S is to identify the displaced discs 
(primarily from binocular disparity, since 
light and shadow cues are reduced to a mini- 
mum) from a distance of eight feet. The 
score is the number of correct successive re- 
sponses on two trials. 







Visual Acuity 


14. Orthorater, Near Acuity—Left Eye (R). 
The score is the number of correct successive 
responses on two trials. 

15. Orthorater, Near Acuity—Right Eye 
(R). The score is the number of correct suc- 
cessive responses on two trials. 


Testing Procedure 


Subjects were 100 women volunteers, enrolled in 
undergraduate psychology classes. Ages ranged from 
18 to 23. Three Ss with monocular vision were re- 
jected but were replaced. All testing was done by 
the senior author.

Two adjoining rooms were used. They were windowless,
each 10’ × 11’, painted black; light was well
controlled. Motor tasks were given in one room, 
visual tasks in the other. Tests requiring the S to 
be seated were given at an ordinary office desk; 
those requiring a standing position were given at a 
table. 

Using a table of random numbers, 10 orders of 
presentation were developed for the psychomotor 
tests and 10 for the visual tasks. For each room a 
deck of 100 cards was prepared, listing each sequence
10 times, and shuffled. The sequence for each
S was given on the card on top when she entered 
the test room. Half of the Ss started with vision 
tests; half with dexterity. If a S started with the 
visual tests, retesting was done after all motor tests 
were complete; if the vision tests were last, all four 
tests were given once, with retesting following in the 
same order. 
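
The card-deck procedure for assigning orders of presentation can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: a pseudorandom generator stands in for the table of random numbers, tests are referred to by the numbers given in the test list (1 to 11 for the psychomotor tests), and the name `make_deck` and the seed value are hypothetical.

```python
import random


def make_deck(tests, n_orders=10, copies=10, seed=None):
    """Build a shuffled deck in which each of n_orders random sequences appears copies times."""
    rng = random.Random(seed)
    orders = []
    for _ in range(n_orders):
        order = list(tests)
        rng.shuffle(order)  # one random order of presentation
        orders.append(tuple(order))
    # List every sequence the same number of times, then shuffle the cards.
    # (Two of the random orders could coincide by chance; that is not checked here.)
    deck = [order for order in orders for _ in range(copies)]
    rng.shuffle(deck)
    return deck


motor_tests = list(range(1, 12))  # Tests 1 to 11, given in the motor room
deck = make_deck(motor_tests, seed=1959)
print(len(deck))
print(deck[0])  # the card on top gives the sequence for the next S
```

Drawing from such a deck keeps the number of Ss per sequence equal while leaving the assignment of any particular S to a sequence to chance.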

Pearson product moment correlations were computed
for the matrix, using the sum of the two administrations
of each test as the raw score. Reliability
estimates were computed by correlating the
two independent administrations and then correcting
by applying the Spearman-Brown prophecy formula.
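
The Spearman-Brown correction mentioned here estimates the reliability of a lengthened test from the correlation between its parts. A minimal sketch follows; the function name and the example correlation of .60 are hypothetical and are not values from this study.

```python
def spearman_brown(r_part: float, length_factor: float = 2.0) -> float:
    """Reliability of a test lengthened by length_factor, given part reliability r_part."""
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)


# Two administrations summed into one score: the test is doubled in length.
print(round(spearman_brown(0.60), 2))  # prints 0.75
```

Because the raw score is the sum of two administrations, the doubling case (a length factor of 2) is the one that applies to the correlation between the two administrations.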

The procedure used for the analysis was Thurstone’s
centroid method, as described by Fruchter
(1954). Three criteria were employed to indicate
sufficient factor extraction, in accordance with Cattell’s
(1952) dictum of erring on the side of extracting
too many factors rather than too few. These
criteria were Tucker’s Phi, Humphreys’ Rule, and
Coombs’ Criterion. Five factors were extracted and
rotated to approximate simple structure and positive
manifold. The data (perhaps unaware of the writers’
intention to deal with oblique factors!) required orthogonal
rotation, which was done graphically. Nine
rotations were needed to approximate criterion.
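
For readers unfamiliar with the centroid method, its first extraction step can be sketched as below. This is a simplified illustration of the step as it is commonly described (each loading is a column sum of the correlation matrix divided by the square root of the grand total), not a reproduction of the computing routine in Fruchter (1954). Sign reflection of the residuals, extraction of later factors, and rotation are omitted. The 3 × 3 matrix is hypothetical, with communality estimates assumed on the diagonal.

```python
import math


def first_centroid_loadings(r):
    """Loadings on the first centroid factor of a square correlation matrix r."""
    n = len(r)
    column_sums = [sum(r[i][j] for i in range(n)) for j in range(n)]
    grand_total = sum(column_sums)
    return [s / math.sqrt(grand_total) for s in column_sums]


def residual_matrix(r, loadings):
    """Correlations remaining after the factor is removed: r_ij - a_i * a_j."""
    n = len(r)
    return [[r[i][j] - loadings[i] * loadings[j] for j in range(n)] for i in range(n)]


# Hypothetical matrix generated by a single factor with loadings .8, .6, .5.
r = [
    [0.64, 0.48, 0.40],
    [0.48, 0.36, 0.30],
    [0.40, 0.30, 0.25],
]
a = first_centroid_loadings(r)
print([round(x, 2) for x in a])  # prints [0.8, 0.6, 0.5]
residual = residual_matrix(r, a)
```

In this constructed example one factor reproduces the matrix exactly, so the residuals vanish; with real data the residual matrix is the starting point for the next factor.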


Results 


The original centroid loadings are shown in
Table 1, with the rotated loadings in Table 2.
Of the five factors, one appears to be merely
a residual, one a triplet; three would justify
identification. In this discussion, test variables
with significant loadings (.30) on each
factor will be presented; test variables that
have their highest loading on the factor will
be identified by an asterisk (*).
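
The listing convention just described can be expressed as a small routine. The sketch below is illustrative only. It treats a loading of .30 or more in absolute value as significant, which is an assumption about how the cut-off was applied, and it uses only the loadings reported in the factor listings for Pin Moving and Dowel Manipulation, so “highest” is judged among those reported values. The function name is hypothetical.

```python
THRESHOLD = 0.30  # cut-off for a "significant" loading, as stated in the text


def significant_loadings(loadings, factor):
    """List (test, loading, is_highest) for tests loading on the factor, largest first."""
    rows = []
    for test, by_factor in loadings.items():
        value = by_factor.get(factor, 0.0)
        if abs(value) >= THRESHOLD:
            # The asterisk marks a test whose highest loading is on this factor.
            is_highest = abs(value) == max(abs(v) for v in by_factor.values())
            rows.append((test, value, is_highest))
    return sorted(rows, key=lambda row: -abs(row[1]))


# Loadings as reported in the listings for Factors I, IV, and V.
reported = {
    "Pin Moving": {"I": 0.32, "IV": 0.41, "V": 0.31},
    "Dowel Manipulation": {"I": 0.60, "IV": 0.33, "V": 0.33},
}

for test, value, is_highest in significant_loadings(reported, "IV"):
    print(f"{test:20s} {value:.2f}{'*' if is_highest else ''}")
```

For Factor IV this marks Pin Moving with an asterisk and Dowel Manipulation without one, as in the listing for that factor.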

Factor I is identified as Manual Dexterity
and has significant loadings on the following:


Test
No.   Test                                       Loading
 2    Minnesota Rate of Manipulation, Placing    [illegible]
 4    Purdue Pegboard, Nonpreferred Hand         68*
 5    Purdue Pegboard, Both Hands                63*
 1    Minnesota Rate of Manipulation, Turning    61*
 3    Dowel Manipulation                         60*
 6    O’Connor Finger Dexterity                  50*
10    Bowling Green Tweezer Dexterity            37
11    Placing, Tweezer                           37
 9    Pin Moving                                 32


The identification of this factor is based 
primarily on Tests 2, 4, 5, 1, and 3. The 
loadings on these tests are more than ade- 
quate, and this factor accounts for the major 
portion of the explained variance on each of 
these tests. The visual tests have loadings of 
near zero on this factor, indicating rather 
clearly that visual skill is not involved. 

On each of the first five tests, a major char- 
acteristic is the movement of arm and hand. 
The other tests may also involve some fairly 
gross arm movement, although it is not so 
marked. Tests 10, 11, and 9, of course, have 
at least one other equal or higher loading on 
another factor. 

Factor II, Visual Sensitivity, is identified
by the four vision tests:


Test
No.   Test                                  Loading
15    Orthorater, Near Acuity—Right Eye     [illegible]
14    Orthorater, Near Acuity—Left Eye      [illegible]
12    Orthorater, Depth Perception          [illegible]
13    Depth Perception                      [illegible]


Other studies have reported more refined
visual factors; Zachert (1951) reported the
two acuity tests as having high loadings on a
factor labeled “Acuity,” but included the Orthorater
Depth Perception Test in her results
as a specific factor having no significant
loading on Acuity. Rabideau (1955) identified
two more kinds of acuity factors. In this
study, however, no such distinction could be
made; accordingly we have chosen an identifying
label (visual sensitivity) that is deliberately
inclusive.


Table 1

Centroid Factor Loadings

[Loadings not legible. Variables: 1. Minnesota, Turning; 2. Minnesota, Placing; 3. Dowel Manipulation; 4. Purdue Pegboard (N); 5. Purdue Pegboard (B); 6. O’Connor Finger; 7. Placing, Finger; 8. O’Connor, Tweezer; 9. Pin Moving; 10. Bowling Green Tweezer; 11. Placing, Tweezer; 12. Depth, Orthorater; 13. Depth Perception; 14. Near Acuity, Left Eye; 15. Near Acuity, Right Eye.]


Factor III had only one test (O’Connor 
Tweezer Dexterity) with a loading greater 
than .30. It is considered a residual. 


Factor IV is a triplet which we have not 
identified: 


Test 

No. Test Loading 
9 Pin Moving 41* 
11 Placing, Tweezer 37 
3 Dowel Manipulation 33 


It might be suggested that this factor could 
involve a grasping ability. This implies that 
holding the tweezers involves a kind of “con- 
trolled grasping’—a lack of fumbling. The 
same implication can also apply to the Dowel 
Manipulation Test; the finger movements in- 
volved are markedly similar. Anything more 
than a mild suggestion, however, is open to 
serious question, since several similar tests 
failed to show loadings even approaching rea- 
sonable significance on this factor, yet would 
also appear, introspectively, to involve the 
same type of muscle control. Any naming of 
this factor would be too tentative to be 
justified. 

Factor V is tentatively defined as Visual
Feedback; there appears to be no precedent
for it in factor analytic literature. The following
tests have significant loadings on this
factor:


Test
No.   Test                                  Loading
10    Bowling Green Tweezer Dexterity       53*
 6    O’Connor Finger Dexterity             43
 8    O’Connor Tweezer Dexterity            42*
14    Orthorater, Near Acuity—Left Eye      40
15    Orthorater, Near Acuity—Right Eye     36
13    Depth Perception                      34
 3    Dowel Manipulation                    33
 9    Pin Moving                            31


The interpretation of this factor defines 
Visual Feedback as the ability to use fine 
visual cues in the manipulation and placing 
of small objects. The dexterity tests with 
high loadings on this factor require the plac- 
ing of small pins or discs in holes. It ap- 
pears that more than skillful manipulation 
with tweezers or fingers is necessary in this 
operation; it is also necessary to interpret 
correctly the sensory cues received while per- 
forming the task and to adjust performance 
accordingly. The high loadings of three vi- 
sion tests on this factor indicate that a visual 
ability is involved. The nature of this visual 
ability may be related to the positioning of 
pins or discs before the actual placing of them 
in the holes. Apparently some visual ability 







Table 2 


Rotated Factor Loadings 


Variable Ill IV V 


. Minnesota, Turning — .06 27 14 
. Minnesota, Placing — .06 17 18 
Dowel Manipulation . d 14 33 33 
. Purdue Pegboard (N) .68 : 13 .03 
. Purdue Pegboard (B) : i j 17 me 
. O’Connor Finger s .09 14 43 
. Placing, Finger ; : i .25 
. O’Connor, Tweezer » > 34 17 42 
. Pin Moving a 10 00 Al 31 
. Bowling Green Tweezer a 04 ‘ 00 53 
. Placing, Tweezer a .03 .28 37 AS 38 
. Depth, Orthorater : 57 12 04 08 38 
. Depth Perception : mR — .09 05 34 Al 
. Near Acuity, Left Eye —.19 62 — .08 14 40 61 
15. Near Acuity, Right Eye — .08 65 .00 16 36 58 




* Reliability estimates are first-half, last-half correlations, corrected for full length by the Spearman-Brown formula 


is required for adequate performance because 
of the extreme smallness of the objects being 
manipulated. 

Other sensory cues are probably involved 
that could act in conjunction with vision; per- 
haps in cases of visual impairment these could 
be substituted for visual cues. If other sensory
cues are involved, then this factor could
be designated by the more general term “Sensory
Feedback.” The more restricted term
seems preferable, however. For example, the
Minnesota and Purdue test series had very
low loadings on this factor. These are relatively
gross tests; they do not seem to require
such fine visual discrimination. It
seems reasonable to suggest that any sensory 
feedback involved in these tests would in- 
volve the skin senses more than vision. 

It is also possible that this factor may be 
identified as eye-hand coordination—a desig- 
nation conspicuous for its absence from the 
factor analytic literature. Such a designa- 
tion is not justified, however, since the visual 
tests of this battery involve no motor ac- 
tivity. Fleishman (1953, 1954) identified a 
factor which is based on eye-hand coordina- 
tion and which he called Aiming Ability. This 
factor he found only with paper and pencil 
tests, but a companion factor called Position- 
ing was found with apparatus tests. This 
title might be appropriate for the present fac- 


tor. The use of the same title might, how- 
ever, imply a matching of the two factors, 
and such an implication would be most un- 
justified, since Fleishman’s Positioning factor 
had a high loading on the Minnesota Placing 
Test. Clearly, the naming of this factor is
tentative, but further research seems indicated.


Discussion 


The hypotheses upon which the present re- 
search was based have not been supported, 
but important questions for additional study 
have been raised. Apparently, additional ex- 
plained variance in psychomotor skills will 
not be found by the simple expedient of using 
finer movements or extra tools. Not only 
did the hypothesized tweezer dexterity factor 
fail to appear, but also the finger dexterity 
factor, accepted as previously established, was 
not found. Reference tests for such a factor 
were included here among the tests identify- 
ing the manual dexterity factor. An example 
of such a shift is the Purdue Pegboard. These 
tests appear to involve as a predominant fea- 
ture an increasingly rapid and accurate move- 
ment of the entire arm; finer finger and wrist 
movements are involved, but to perhaps a 
lesser degree. 

Apparently it can be concluded that, at 
least for standard tests of dexterity, even
the finest tweezer or finger test will have a
significant amount of its variance attributed 
to the ability to make rapid arm movements. 
If this is true, then factor analytic studies 
designed to test the hypothesis of a separate 
factor for these more restricted movements 
would need to provide tasks in which gross 
arm movements are either eliminated or 
greatly reduced. 

Another possible explanation of the failure 
to identify a finger dexterity factor may be 
methodological. Generally, previous research 
seems to have used a standardized order of 
testing. In such a case, it appears possible 
that many correlations obtained might be 
artifacts of practice or fatigue. Moreover, 
test variables requiring similar responses can 
provide a set for succeeding tests; in this 
manner the practice effect could be consider- 
able. In other words, it can be suggested as 
a hypothesis for research that order of test 
administration may influence the factorial 
structure of a test battery. If this hypothe- 
sis is tenable, then previously accepted fac- 
tors of psychomotor ability might warrant 
re-evaluation. 

Factor V, tentatively named “Visual Feed- 
back,” accounts for much of the variance for 
the finer, more restricted tests that was not 
accounted for by the manual dexterity fac- 
tor. Perhaps previous studies have found this 
factor but have not identified it as such be- 
cause of the lack of visual tests in the bat- 
teries. It seems possible that such feedback 
may be the basis for the second dexterity 
factor found in many studies. It seems plau- 
sible also that other sensory modalities may 
contribute enough feedback to account for 
still more of the total variance. The consid- 
eration of this possibility raises the important 
question of whether much of the presently 
unexplained variance in psychomotor tasks 
could be accounted for by relationships that 
exist between factors in divergent domains. 
Future research that investigates possible re- 
lationships between different factor domains 
may find additional factors, such as the 
Visual Feedback factor that apparently does 
not belong in any one domain but appears to 
lie along the border of previously distinct 
domains. 




Summary and Conclusions 


This study has been concerned with the 
factor analysis of a battery of dexterity tests 
and vision tests in an effort (a) to identify
a “tweezer dexterity” factor and (b) to determine
the relationship between fine dexterities
and visual skills, particularly depth perception.
The centroid method was used and
the five factors extracted were rotated or- 
thogonally. Three of the factors extracted 
were identified: (a) manual dexterity, the 
ability to make skillful and rapid arm-hand 
movements, (b) visual sensitivity, the ability 
to make fine visual discriminations involving 
acuity and/or depth perception, and (c) 
visual feedback, the ability to use fine visual 
cues in the manipulation and placing of small 
objects. The identified factors include neither 
the tweezer dexterity factor hypothesized, nor 
the separate finger dexterity factor found in 
previous studies with the same tests. 


Received September 10, 1958. 


References 


Cattell, R. B. Factor analysis. New York: Harper,
1952.

Dvorak, Beatrice. The new USES general aptitude
test battery. J. appl. Psychol., 1947, 30, 372-376.

Fleishman, E. A. Testing for psychomotor abilities
by means of apparatus tests. Psychol. Bull., 1953,
50, 241-262.

Fleishman, E. A. A factorial study of psychomotor
abilities. USAF Personnel Train. Res. Cent., Lackland
AFB, San Antonio, Texas, May 1954.

Fleishman, E. A., & Hempel, W. E., Jr. Changes in
factor structure of a complex psychomotor test as
a function of practice. Psychometrika, 1954, 19,
239-251. (a)

Fleishman, E. A., & Hempel, W. E., Jr. Factorial
analysis of complex psychomotor performance.
USAF Personnel Train. Res. Cent., Lackland AFB,
San Antonio, Texas, April 1954, 54-12. (b)

French, J. W. The description of aptitude and
achievement tests in terms of rotated factors.
Psychometr. Monogr., 1951, No. 5.

Fruchter, B. Introduction to factor analysis. New
York: Van Nostrand, 1954.

Hempel, W. E., & Fleishman, E. A. A factor analysis
of physical proficiency and manipulative skill.
J. appl. Psychol., 1955, 39, 17-19.

Rabideau, G. F. Differences in visual acuity measurements
obtained with different types of targets.
Psychol. Monogr., 1955, 69, No. 10 (Whole No.
395).

Zachert, Virginia. A factor analysis of vision tests.
Amer. J. Optom. Arch. Amer. Acad. Optom., 1951,
38, 405-416.





Journal of Applied Psychology
Vol. 43, No. 3, 1959 


INTERACTIONS BETWEEN DISPLAY GAIN AND 
TASK-INDUCED STRESS IN MANUAL 
TRACKING 1


W. D. GARVEY AND JEAN B. HENSON


Naval Research Laboratory 


It is important for the human engineer to 
know how the performance of man-machine 
systems is affected by factors which stress or 
overload the human operator. In an earlier 
study (Garvey, 1957) it was found that the 
order of merit of different tracking systems, 
produced by varying the nature of the con- 
trol dynamics, remained unchanged when the 
operator was subjected to various forms of 
“task-induced stress.” The present study ex- 
tends the investigation to systems, all of 
which have the same control dynamics, but 
which differ in display magnification. 


Apparatus and Procedure 


A compensatory position control tracking device
was employed throughout the experiment. A flow
chart of the tracking device is shown in Fig. 1. The
Ss’ task was to attempt to hold a dot on the centerline
of a 5-in. CRT by manipulating a spring-restrained
control stick with the right hand. The target
dot was continuously forced off center along the
horizontal by a complex sine wave generator which
furnished frequencies of 11, 7, and 3 cycles per min.,
the amplitudes being inversely proportional to the
frequency. Any one of five display gain settings
could be selected through the positioning of a switch.
The gains and associated maximum target excursions
are shown in Table 1. The system was so adjusted
that with an arbitrary gain setting of 0.15, 10 ounces
of force applied to the joy stick moved the dot 1 in.
Increased magnifications were achieved by increasing
the gain in steps up to 1.00, the highest magnification
employed.
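
The forcing function described above can be sketched numerically. The following is only an illustration: it assumes that the "complex sine wave" is a plain sum of three sinusoids, that all components start in phase, and that the amplitudes are normalised to sum to 1. None of these details is given in the text, and the function name is hypothetical.

```python
import math

# Component frequencies from the text, in cycles per minute.
FREQS_CPM = (11.0, 7.0, 3.0)


def forcing_signal(t_sec, freqs_cpm=FREQS_CPM):
    """Sum of sinusoids with amplitudes inversely proportional to frequency.

    Assumptions (not stated in the source): zero phase for every
    component, and amplitudes normalised so that they sum to 1.
    """
    raw_amps = [1.0 / f for f in freqs_cpm]      # amplitude ~ 1 / frequency
    scale = sum(raw_amps)
    total = 0.0
    for f, a in zip(freqs_cpm, raw_amps):
        omega = 2.0 * math.pi * f / 60.0         # cycles/min -> radians/sec
        total += (a / scale) * math.sin(omega * t_sec)
    return total


# One 60-sec. trial sampled ten times per second.
samples = [forcing_signal(k / 10.0) for k in range(600)]
```

Because every component completes a whole number of cycles in one minute, the sketched signal repeats exactly once per 1-min. trial.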

Two types of performance measures were taken,
system error and display error. As may be seen in
Fig. 1, system error is a measure of the extent to
which the output of a tracking system follows the
input. It is obtained by electronically integrating
the continuous difference in voltage between the system
input and system output without regard to sign.
Display error, the system error multiplied by the selected
gain constant, is the input signal to S and is,
in effect, a measure of the error seen by S. Although
the error at the display could always be computed


1 The authors are greatly indebted to F. V. Taylor
for his assistance in the interpretation of the results
and for his comments in the preparation of the
manuscript.




from the system score, a separate electronic integra- 
tor was also used in the study to furnish a direct 
measure of display error. 
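
The relation between the two scores can be illustrated in a few lines. This is a sketch only: the sampled signals, the rectangular integration rule, and the function names are assumptions standing in for the analog integrators described above.

```python
def integrated_abs_error(system_input, system_output, dt):
    """Approximate the time integral of |input - output| (sign ignored)."""
    return sum(abs(i - o) for i, o in zip(system_input, system_output)) * dt


def display_error(system_error, gain):
    """Display error is the system error multiplied by the selected gain."""
    return gain * system_error


# Hypothetical samples, for illustration only.
inp = [0.0, 1.0, 2.0, 1.0]
out = [0.0, 0.5, 2.5, 1.0]
sys_err = integrated_abs_error(inp, out, dt=1.0)   # 0 + 0.5 + 0.5 + 0
disp_err = display_error(sys_err, gain=0.5)
```

Since the gain is a constant multiplier, either score can be recovered from the other, which is why the separate integrator served only as a direct check.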

Each tracking trial was 1 min. in duration and
performance during only the last 55 sec. of the trial
was scored. The trackers were five Naval enlisted
men. The general experimental design consisted of
five Ss, five display gain settings, and five 1-min.
trials. The design was permuted from session to
session so that each S had four trials per day with
each magnification. At the end of the seventh session
(i.e., after 28 practice trials with each gain)
Ss were required to operate under a series of conditions
intended to degrade performance. In accordance
with previous practice (Lazarus, Deese, & Osler,
1952) these will be referred to as types of task-induced
stress.

Each stress condition was presented on a separate
day with a different counterbalancing of display magnifications
and Ss. The Ss were always given one
trial without stress on the particular magnification
level they were to experience next under stress. A
careful explanation was then made as to how each
stress mode of operation differed from that of the
training sessions. The stress trials consisted of one
1-min. trial with each gain setting under each of the
following conditions: 2

Secondary arithmetic task. The Ss were required
to solve two-digit mental subtraction problems at a
rapid pace, while tracking.

Incompatible display-control relation. The Ss performed
the tracking task with a display-control relationship
which was the reverse of that on which
they were trained.

Two-hand tracking. The Ss were required to track
two dots simultaneously, one with a right-hand control
and one with a left-hand control. The left-hand
system was identical with the right-hand system; the
two dots followed the same course input, which was
that employed during the training sessions. Only the
performance of the right-hand system was used in
analyzing the results.

Two-coordinate tracking. In this condition the
dot moved and was tracked in both the vertical and
horizontal coordinates. The dot was controlled with
the right stick only. The course input to the system
was identical with that used during training,
except its path of movement was rotated 45 deg.
from the horizontal axis. Only the performance of
the system in the horizontal coordinate was used in
analyzing the results.

Secondary visual task. In addition to the tracking
task employed during training, Ss were required to
perform simultaneously a second task which consisted
in detecting and reporting range and bearing
of targets on a simulated radar scope.


2 For a more complete description of these stress
conditions, see Garvey (1957).


Table 1
Magnification and Maximum Target Excursion
for Each Display Gain

Display Gain    Magnification    Maximum Target Excursion
0.15                             2.0 in.
0.25                             3.4 in.
0.32                             4.2 in.
0.50                             6.7 in.
1.00                             13.4 in.


[Fig. 1: block diagram; label: SYSTEM OUTPUT]


Results 3


Effect of display magnification. After approximately
20 trials, improvement in the performance
with all magnifications appeared to
have leveled off. The pooled data from the
last five of the 28 training trials are plotted
in Fig. 2. The solid lines show the effect
of magnification on system error in the left
graph and on display error in the right graph.
Increasing magnification reduces system error
and increases display error (p < .001).

Effect of stress. The performance under
task-induced stress is also shown in Fig. 2.
In the dotted-line graphs the data from all
five stress conditions are pooled. It may be
seen that the effect of stress is to increase all
error scores, but to leave the rank order of conditions
unchanged. There is a tendency for
the absolute differences between stressed and
unstressed performance to increase going from
low to high magnification. Although this
trend is not significant (p > .05) for system
error scores, it is significant with display error
(p < .01).


3 The statistical methods employed in the analysis
of these data were nonparametric tests developed by
Wilcoxon (1949). Case I (Wilcoxon, 1949, p. 4)
was used for determining the significance of the difference
between two conditions, and Case III (Wilcoxon,
1949, p. 6) was used for the comparison of
several conditions.


Fig. 1. Simplified block diagram of manual tracking system.

Figure 3 shows the median percentage of 
deterioration in performance under stress for 
each of the stress conditions. The percentage 
of deterioration may be calculated either from 
system-error or display-error scores; the cal- 
culated quantity is mathematically the same 
whichever source is used for the calculation. 
In general, it may be seen that the high display-magnification
systems show greater deterioration.
These functions are all significant
(p < .05) with the exception of the arithmetic-task
condition (.10 > p > .05). However,
even in this case the deterioration with
highest magnification was significantly greater
(p < .01) than that with the lowest magnification
system.
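
The statement that the percentage of deterioration is the same whichever score is used follows from the fact that display error is system error multiplied by the gain. A short worked form is given here, with $E_u$ and $E_s$ denoting unstressed and stressed system error at a given gain $g$. These symbols are introduced only for illustration, and the sketch assumes that deterioration is expressed relative to the unstressed score, which the text does not state explicitly.

```latex
\[
\frac{gE_s - gE_u}{gE_u} \times 100
  \;=\; \frac{g\,(E_s - E_u)}{g\,E_u} \times 100
  \;=\; \frac{E_s - E_u}{E_u} \times 100 .
\]
```

The gain cancels, so the percentage computed from display-error scores equals that computed from system-error scores.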

Discussion 


The effect of magnification. The fact that 
increasing display gain improves system per- 
formance has led investigators (Battig, Nagel, 
& Brogden, 1955; Bowen & Chernikoff, 1957; 
Helson, 1949) to infer that operator perform- 
ance is improved. However, performance, 
judged in terms of the error displayed to the 
operator, is shown to deteriorate as display 
gain was increased. Thus, in terms of dis- 
play error, an opposite conclusion might be 
reached about “operator performance.” These 





Display Gain and 





——- UNSTRESSED 
o— ——e STRESSED 


(ARBITRARY UNITS) 


MEDIAN INTEGRATED SYSTEM ERROR 








} 
o4 06 
DISPLAY GAIN 





Fic. 2. 
curves represent performance without 
with stress. 
right graph. 


conflicting conclusions demonstrate that scores 
obtained from a man-machine system are not 
always a valid indicant of the response of 
the human operator in the system. 

The effect of stress. If the tracking device 
with each of the different gain settings em- 
ployed in the present study is regarded as 
constituting a different man-machine system, 
the results are in agreement with the previous 
experiment (Garvey, 1957) in demonstrating 
that the order of system merit is not changed 
by these particular forms of stress. With 
magnification as the system variable, the sys- 
tems giving the better scores in the absence 
of stress remain the better systems during 
stress. This is of considerable practical im- 
portance to the engineering psychologist, al- 
though generalization to all varieties of sys- 
tems is certainly not yet warranted. 

It is clear from the results that the effects 
of stress may be differently interpreted from 
the two measures of performance, i.e., the 
best tracking performance (in terms of sys- 
tem error), and the poorest tracking perform- 
ance (in terms of display error), show the 


Median integrated error as a function of display gain 


Task-Induced Stress 





-_——- UNSTRESSED 
om—=—@ STRESSED 


MEDIAN INTEGRATED DISPLAY ERROR (ARBITRARY UNITS) 











04 06 
DISPLAY GAIN 


The 


solid 


stress; the dotted curve represents that 
System error is shown in the left graph and display error in the 


greatest deterioration under stress. These re- 
sults serve further to point out the pitfalls of 
directly equating human performance with 
tracking scores in experiments conducted with 
man-machine systems. 

But this is not to say that no inferences 
about the human can be made. The results 
of the present experiment lead rather di- 
rectly to the hypothesis that stress, of the 
type employed in this study, will cause the 
greatest degradation in the performance of 
those position control systems which most tax 
the human operator. It is reasonable to sup- 
pose that the greater the error S perceives, the 
more effort he expends in attempting to re- 
duce it and the less of his capacity remains 
available for simultaneously contending with 
secondary tasks or other circumstances which 
also demand his attention. In this situation, 
effort devoted to dealing with intrusions of 
any sort must be subtracted from that previ- 
ously fully accessible to the tracking task, 
with the result that the latter suffers to the 
extent that the needed capacity is no longer 
available. 





Fig. 3. Median percentage of deterioration in performance
under conditions of stress as a function of
display gain. The five curves represent the different
stress conditions employed. [Curve labels: arithmetic,
incompatible display-control, 2 hand, 2 coordinate, visual
task; ordinate: median percent deterioration in performance;
abscissa: display gain.]


The finding of the present study, that stress 
causes a greater deterioration in performance 
the higher the gain of the display, should not 
lead to the recommendation that display gain 
should always be set as low as possible in any 
practical control system. It must be remem- 
bered that the control designer works to mini- 
mize system error, regardless of what hap- 
pens to the error at the display. Although 
stress does exact a heavier toll with higher 
magnifications, it must be recalled that the 
higher gain systems, in terms of system error,
always remained superior to the lower under
every form of stress employed in this study.


Summary 


An experiment was conducted to determine 
how the performance of man-machine systems 
is affected by factors which stress or over- 
load the human operator. Five systems, all 
of which had the same dynamics but differed 
in display magnification, were employed. Op- 
erators were given considerable training on all 
five systems, after which they were required 
to control the systems under a series of stress- 
ful conditions. Performance was measured 
both in terms of system error and error at 
the display. 

The results indicated that stress increased 
error in all systems, but the order of merit 
of the various systems was unchanged by 
stress. The results are discussed in terms of 
their relevance to the study of man-machine 
systems. The pitfalls of purely psychological 
interpretations of the behavior of tracking 
systems are outlined. 


Received September 19, 1958.


References


Battig, W. D., Nagel, E. H., & Brogden, W. J. The
effects of error-magnification and marker size on
bidimensional compensatory tracking. Amer. J.
Psychol., 1955, 68, 585-594.

Bowen, J. H., & Chernikoff, R. The relationships
between magnification and course frequency in
compensatory tracking. U. S. Naval Res. Lab.
Rep., 1957, No. 4913.

Garvey, W. D. The effects of “task-induced” stress
on man-machine system performance. U. S. Naval
Res. Lab. Rep., 1957, No. 5015.

Helson, H. Design of equipment and optimal human
operation. Amer. J. Psychol., 1949, 62, 473-497.

Lazarus, R. S., Deese, J., & Osler, S. F. The effects
of psychological stress on performance. Psychol.
Bull., 1952, 49, 293-317.

Wilcoxon, F. Some rapid approximate statistical
procedures. Stamford, Conn.: American Cyanamid
Co., 1949.





Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


EFFECTS OF FEEDBACK ON INSIGHT AND PROB- 
LEM SOLVING EFFICIENCY IN TRAINING 
GROUPS 


EWART E. SMITH AND STANFORD S. KIGHT


Fels Group Dynamics Center, University of Delaware


Human relations courses in management
development programs are currently very
popular, despite a dearth of experimental evidence
supporting the postulated learning principles
used in constructing these courses.
Two such principles have been tested in the
field experiment reported here.

The hypotheses were:


1. Feedback will: (a) increase group productivity
and (b) increase self-insight.

2. Subgroup structure will result in: (a)
increased group productivity and (b) increased
self-insight.


Method 


Subjects and situation. The subjects (Ss) were 54
male and 49 female first level supervisors (foremen)
in a large eastern corporation. They were participating
in a five-day management training course
which stressed interpersonal factors in management.
The course emphasizes experiential learning, with
role playing exercises, demonstrations of perceptual
fallibility, group discussions, etc., and only a few
short informal lectures. Nine training groups varying
in size from 11 to 13 were used. Each group
contained both men and women. The Ss did not
know they were participating in an experiment.

Experimental design. In Experimental Condition
A, small feedback groups of three or four members
each were established. These feedback subgroups
were scheduled to meet for 30 minutes at the end of
each half day. However, 60 minutes per day proved
to be more time than was readily available. Less
and less time was given to the subgroups as the
original commitment to research on the part of the
company staff faded with time. Consequently, data
were obtained on the amount of time actually spent
in subgroups.

The Ss were given an outline to guide them in
their feedback sessions, which asked the individual
to describe to his subgroup how he saw his own behavior,
why he behaved as he did, etc. The subgroup
members were then asked to compare their
perceptions of his behavior with his account of it.
In addition, the subgroup was asked to give the individual
feedback on how his behavior affected the
leadership problems of the group, whether he talked
too much or too little, whether he was trying out
new behavior or playing it safe, etc.

Condition B was similar to Condition A in that
the Ss were placed in subgroups of three or four
members each. These subgroups were scheduled to
meet for 30 minutes at the end of each half day, to
discuss the course content covered the preceding half
day. Such topics as what they had liked best or
least, what they had learned, etc. were covered.
Condition B was designed as a control condition to
determine if any changes produced by the feedback
in Condition A could be attributed to subgrouping
rather than feedback, and also as a test of the postulate
that subgrouping per se will increase learning
in training groups.

Condition C was a straight control condition with
no induced feedback or subgrouping.

Three training groups were assigned to each condition.
The courses were conducted in the usual
manner by the company’s training staff. Each group
had two leaders who had had some training and
considerable experience in conducting these courses.
These leaders were males holding third-level supervisory
positions in the company. E was introduced
to the group as a consultant at the end of the course
on Friday afternoon, when the testing was done.

Dependent variable measurements. Group productivity
was measured by the number of problems
solved by each three person subgroup. In Conditions
A and B, the subgroups tested on the problem
solving task were the same ones which had been
meeting during the week for feedback, or to discuss
the content of the course. In Condition C the subgroups
were produced, for the problem solving task,
by placing the first three people sitting together in
one subgroup, placing the next three sitting together
in the next subgroup, etc.

The problem solving task (Taylor & Faust, 1952)
is an adaptation of the parlor game “Twenty Questions”
in which the Ss had to identify items designated
by the experimenter as either animal, vegetable,
or mineral. Each problem had to be solved
with less than 40 questions for the subgroup to receive
credit. No time limits were applied to the
solving of particular problems, but the productivity
of each subgroup was measured by the number of
problems solved in 10 minutes. To each question
asked by a group member, E replied in one of the
following ways: (a) yes, (b) no, (c) partly, (d)
sometimes, (e) not in the usual sense of the word,
(f) I don’t know (no charge for the question), (g)
please restate the question (if the question was unclear
or could not be answered in one of the above
ways). Difficult items such as wrench, ruby, and
bread were used. This task has, in other investigations
(Goldman, Bolen, & Martin, 1958; Smith,
1957; Taylor & Faust, 1952), successfully discriminated
between groups whose effectiveness could be
judged on other criteria.

Self-insight was measured by having each conferee 
indicate, for each of 10 leadership roles, who (in- 
cluding himself) took the role well. An insight score 
was computed by determining the discrepancy (ig- 
noring the sign) between the number of roles S 
credited to himself, and the average number of roles 
credited to him by the two staff leaders. Self-insight 
of foremen has been shown to be highly related to 
the productivity of their departments (Nagle, 1954).
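
The insight score described above can be written out as a short sketch. The function name and the example figures are hypothetical; only the rule itself (absolute discrepancy between the self-credited count and the mean of the two leaders' counts) comes from the text.

```python
def insight_score(self_credited, leader_credits):
    """Absolute discrepancy between self-credited roles and the leaders' mean.

    Lower scores mean closer agreement with the leaders, i.e., more insight.
    """
    leader_mean = sum(leader_credits) / len(leader_credits)
    return abs(self_credited - leader_mean)


# Hypothetical conferee: credits himself with 6 of the 10 roles,
# while the two staff leaders credit him with 3 and 4.
score = insight_score(6, [3, 4])   # |6 - 3.5|
```

Because the sign is ignored, over-crediting and under-crediting oneself by the same amount yield the same score.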

An anonymous evaluation of the training course 
was also obtained from each conferee by asking him 
to rate the course on a five point scale in the fol- 
lowing three areas: how much he felt he benefited 
from the training, how much he felt he learned about 
himself as a result of the training, and how helpful 
he felt the training would be back on the job.


Results and Discussion 1


As an analysis of the data on all measures 
indicated that Control Conditions B and C 
were not different, they have been combined. 
This failure of Conditions B and C to differ 
indicates that subgroup structure did not 
have the hypothesized effect on learning and 
that any effects occurring in Condition A re- 
sult from feedback alone. 

Table 1

Mean Insight Scores Under Experimental
and Control Conditions

                                       Feedback
Statistic on Insight Scores         I      II     III    Control
Number of minutes in
  feedback subgroups               230    193    165
Mean                               .90    1.73   2.38
n                                  10     11     12

Note. Low scores indicate insight. CR on Feedback I and
Control, 3.14**; t on Feedback I and Feedback III, 2.12*.


1 Statistical significance at the .05 level has been
indicated by a single asterisk; at the .01 level or
better by a double asterisk.


Plotting the group task scores revealed that
normal distributions could not be assumed.
Consequently, a nonparametric rank test was
used (Edwards, 1954). By this method a z is
obtained which is a normal deviate, evaluated
by the use of the normal probability table.
The median number of problems solved by
the 10 three-person subgroups in the feedback
condition was 2.00, compared to a median of
zero for the 21 control subgroups. The z
on these data is 2.27*. These differences are
highly consistent, as revealed by categorizing
the subgroups according to whether they
solved one or more problems or were unable
to solve any. With such an analysis, we find
that all of the feedback subgroups solved at
least one problem, compared to nine of the
control subgroups (the p on these data is .002
by Fisher’s exact test).
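
The exact-test figure can be checked from the counts given in the text: 10 of 10 feedback subgroups and 9 of 21 control subgroups solved at least one problem. The sketch below assumes a one-tailed test, which the text does not specify; the function name is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p for the 2 x 2 table [[a, b], [c, d]].

    With all margins fixed, sums the hypergeometric probabilities of
    tables whose top-left cell is at least as large as the observed a.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        p += comb(row1, k) * comb(row2, col1 - k) / denom
    return p


# Rows: feedback, control. Columns: solved one or more, solved none.
p = fisher_one_sided(10, 0, 9, 12)
```

Under the one-tailed assumption the result rounds to .002, in agreement with the value reported above.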

Feedback, then, consistently produced 
greater problem solving efficiency. It should 
be noted, however, that these data do not in- 
dicate whether feedback made the conferees 
more effective members of problem solving 
groups in general, or only made these par- 
ticular subgroups more efficient. 

If all feedback and control groups are com- 
pared on insight scores, they are not sig- 
nificantly different. However, as we see in 
Table 1, the strength of the feedback inde- 
pendent variable was gradually reduced by a 
decline in the amount of time the company’s 
training staff devoted to the feedback sub- 
groups. It is interesting to note the perfect 
negative correlation between the amount of 
time spent in feedback subgroups and the 
mean insight error scores. As indicated in 
Table 1, the first feedback group, which came 
closest to receiving the specified amount of 
time in feedback subgroups (300 minutes) 
had significantly lower insight scores than 
either the last feedback group or the con- 
trol groups. 
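
The "perfect negative correlation" mentioned above can be illustrated with a rank correlation on the three feedback groups. This sketch assumes that the statement refers to rank order and that the three means in Table 1 belong to Feedback I, II, and III, with the first read as .90; the helper names are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman rank correlation, 1 - 6*sum(d^2) / (n*(n^2 - 1)); no ties."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


minutes = [230, 193, 165]          # Feedback I, II, III (Table 1)
mean_error = [0.90, 1.73, 2.38]    # mean insight error scores (Table 1)
rho = spearman_rho(minutes, mean_error)
```

With these three pairs the ranks are exactly reversed, so the coefficient takes its minimum value of -1.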

The mean rating of the course by the con- 
ferees in the feedback condition was 12.97, 
which is significantly lower (CR 2.27*) than 
the mean of 13.66 for the control condition. 
In view of the superiority of the feedback 
condition as indicated by the problem solving 
and insight measures, it would appear that 
the frequent use of the opinions of conferees 
for evaluating industrial training programs 
should be re-evaluated. 







Summary 


An experiment was conducted in a field 
setting to investigate two of the learning prin- 
ciples utilized in human relations courses. 
The Ss were 103 first line supervisors, in 
groups of about 12, in a one week, highly 
participative management course. In the ex- 
perimental condition, the training groups were 
divided into three-person subgroups which 
met twice daily to give each other personal- 
ized feedback on their behavior in the group. 
In one control condition, there were twice 
daily meetings of three person subgroups 
which discussed the content of the course, and 
in the other control condition there was no 
subgrouping. 

The hypotheses tested in the present study 
were: 


1. Feedback will: (a) increase group pro-
ductivity and (b) increase self-insight.

2. Subgroup structure will result in: (a)
increased group productivity and (b) in-
creased self-insight.


The data indicated that personalized feed-
back markedly, and consistently, improved
group problem solving efficiency. Under some
conditions, feedback improved self-insight.
Anonymous evaluations of the course by the
trainees favored the control conditions, indi-
cating that conferee ratings may not be an
adequate basis for evaluating such courses.
The hypotheses regarding subgroup structure
were not supported.


Received October 13, 1958. 
Early Publication. 


References 


Edwards, A. L. Statistical methods for the behav-
ioral sciences. New York: Rinehart, 1954.

Goldman, M., Bolen, M. E., & Martin, R. B. The
effect of group structure on the performance of
groups engaged in a problem solving task. Amer.
Psychologist, 1958, 13, 353.

Nagle, B. F. Productivity, employee attitude and
supervisor sensitivity. Personnel Psychol., 1954, 7,
219-233.

Smith, E. E. The effects of clear and unclear role
expectations on group productivity and defensive-
ness. J. abnorm. soc. Psychol., 1957, 55, 213-217.

Taylor, D. W., & Faust, W. L. Twenty questions:
Efficiency in problem solving as a function of size
of group. J. exp. Psychol., 1952, 44, 360-368.





Journal of Applied Psychology 
Vol. 43, No. 3, 1959 


PERSONALITY TEST SCORES IN THE MANAGEMENT HIERARCHY:
REVISITED


HENRY D. MEYER AND ALAN J. FREDIAN


Stevenson, Jordan & Harrison, Inc., Chicago 


One major purpose of this study was to re- 
peat, with an entirely new sample, the man- 
agement hierarchy validation study of the 
Employee Questionnaire, the personality test 
developed and used by Stevenson, Jordan & 
Harrison psychologists in their management 
personnel appraisals. The original study 
(Meyer & Pressel, 1954) was done on 459 
cases, covering the period of July, 1949 to 
February, 1952. The present study was done 
on 678 cases, covering the period from July, 
1955 to May, 1957. The soundness of the 
use of the Employee Questionnaire in the ap- 
praisal of management candidates depends, in 
part, on the consistency of management hier- 
archy trends in its various personality scales. 
This consistency was to be tested. 

A second major purpose of this study was
to perform, for the first time, a management
hierarchy validation study of nine new scales
added to the Employee Questionnaire follow-
ing the original study. The seven scales of
the EQ-B of the original study were objec- 
tivity, social dominance, social extroversion, 
drive, detail, emotionality, and adjustment 
(poor). The additional nine scales which, 
together with these original seven scales, con- 
stitute the EQ-C of the present study, were 
social consideration, judgment and decision, 
adjustment somatic, psychopathic tendencies, 
drive persistence, recognition anxiety, per- 
sonal achievement motivation, compensatory 
achievement motivation, and independent 
achievement motivation. In addition, the 
original drive scale was modified to a more 
limited scale: drive irritability. 

In the original study (Meyer & Pressel,
1954), social dominance, detail, emotionality,
and adjustment (poor) were found to have
observable trends and statistically significant
Fs with hierarchy. Also, efforts were made
to control for age, education, occupation, and
bias. The general aim of the present study
was to parallel, as far as possible, the original
study on a new sample tested with the modi-
fied and expanded Form C of the EQ person-
ality test. However, the results for occupa-
tion will not be presented in the present study.


Procedures 
The Hierarchy Categories 


The original concept of the industrial management 
hierarchy (Meyer & Pressel, 1954) was modified to 
include an additional level of jobs, labeled tech- 
nicians, as Level V just above general factory, and 
clerical employees now at Level VI. The technician 
level encompassed such job titles as time study man, 
process man, production scheduler, draftsman, tool 
designer, lead man, and group leader. Otherwise, 
the hierarchy (Table A)¹ was the same as originally,
but generalized as follows: Level I, corporate officers 
and division general managers; Level II, works man- 
agers and heads of major functional areas of non- 
officer status; Level III, supervisors of departments 
of major functional areas and exceptional sales and 
engineering staff jobs of equivalent status; Level IV, 
first line production and office supervisors and staff 
specialists; Level V, technicians; Level VI, general 
factory and clerical employees.


The Sample 


Inasmuch as in three out of six hierarchy levels
the new data sample would not yield an even 100
cases as in the original study (Meyer & Pressel, 1954),
it was decided to take from the S. J. & H. appraisal
files, all available cases at each level that met the
original criteria (Meyer & Pressel, 1954) for selec-
tion and placement in a particular hierarchy level.
The end result was a relatively small number of
cases for Levels I and II, 30 and 44 cases; a rea-
sonably good N for Levels III, V and VI, 144, 78,
and 129 cases; and a very large N for Level IV, 253
cases.


¹ A detailed summary of hierarchy definitions, in-
tercorrelations, hierarchy, education and age trends,
comparative magnitudes of trends, and statistical
analysis of hierarchy trends for all EQ-C scales, as
found in Tables A, B, C, D, E, F, G, and H, has
been deposited with the American Documentation In-
stitute. Order Document No. 5884 from ADI Aux-
iliary Publication Project, Photoduplication Service,
Library of Congress, Washington 25, D. C., remit-
ting $1.25 for 35 mm. microfilm or 6 X 8 in. photo-
copies. Make checks payable to Chief, Photodupli-
cation Service.


The Personality Test 


The data for the present study were obtained from
the EQ-C, a major revision of the EQ-B. The 
EQ-C was developed by an item analysis of 142 
items, consisting of most of the original 75 items of 
the EQ-B, and additional items designed to form six 
of the nine new scales listed previously. In this 
analysis, a major effort was made to eliminate items 
that had high correlations with objectivity, the bias 
key. Also, the attempt was made to reduce, as far 
as possible, the use of items on more than one scale. 
The end result, after a second round of development 
and item analyses for the three achievement motiva- 
tion scales, was a 105-item test with scales of 7 to 
12 items. While similar scales for the EQ-B and 
EQ-C do not have exactly the same items, the cor-
relations between similar scales ranged from r = +.54
between drive and drive irritability, and r = +.62
between adjustment (poor) and adjustment psy-
chological, to r = +.80 for social dominance and
r = +.82 for detail (Table B).¹

The introduction of six of the nine additional 
scales of the EQ-C was for the purpose of broaden- 
ing the possible interpretations derived from the test 
and to make up for some previous deficiencies in 
coverage of characteristics believed to be critical to 
management appraisal. Social consideration was de- 
signed to fill a gap in the area of dealing with peo- 
ple sympathetically; judgment and decision, to add 
information about decisiveness and impulsiveness; 
drive persistence, to reveal staying power rather 
than intensity of effort; adjustment somatic, to re- 
veal susceptibility to psychosomatic reactions to job 
and personal pressures; psychopathic tendencies, to 
reveal antisocial inclinations; and recognition anx- 
iety, to reveal anxiety about pleasing superiors. The 
selection of scale items was by face validity followed 
by item analysis of all the items of the test against 
high and low scoring groups on each scale. 

The three achievement motivation scales were in-
troduced after the final selection of items for the
other scales of the EQ-C. They were based on Mc-
Clelland’s (1953) research on achievement motiva-
tion, and Adler’s (1929) compensation theory of mo-
tivation. The personal achievement scale was formed
by selecting, from all the established scales, items
which conformed to McClelland’s signs of achieve-
ment motivation. The independent achievement scale
was developed from new items relating to the foster-
ing of independence and resourcefulness by the candi-
date’s parents and environmental circumstances (Mc-
Clelland, 1953). The compensatory achievement scale
was developed from new items relating to parental
rejection (Adler, 1929). These three scales were sub-
sequently subjected to an independent item analysis
and modified accordingly.

Intercorrelations of the 16 scales of the EQ-C
(Table C),¹ based on 100 cases, ranged from r = .00
to r = -.65, the latter between emotionality and
judgment and decision. Out of 120 intercorrelations,
14 had an r of + or -.50 or higher. Of these 14,
most had to do with the relations between social 
consideration, judgment and decision, emotionality, 
adjustment psychological, adjustment somatic, drive 
irritability, and recognition anxiety. These intercor- 
relations generally indicated that low judgment and 
decision and low social consideration scores went 
with high irritability, emotionality, anxiety, and poor 
adjustment scores, and that scores on one kind of 
poor adjustment scale went with scores on another. 
The latter was to be expected in the light of the 
attempt, in constructing the scales, to identify more 
refined subcomponents of a general adjustment factor. 
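
The figure of 120 intercorrelations follows from pairing each of the 16 scales with every other scale exactly once. A minimal Python check is sketched below; the scale names are those listed in the text, while the variable names are illustrative only.

```python
from itertools import combinations
from math import comb

# The 16 EQ-C scales named in the text (7 carried over, 9 added).
scales = [
    "objectivity", "social dominance", "social extroversion",
    "drive irritability", "detail", "emotionality",
    "adjustment psychological", "social consideration",
    "judgment and decision", "adjustment somatic",
    "psychopathic tendencies", "drive persistence",
    "recognition anxiety", "personal achievement motivation",
    "compensatory achievement motivation",
    "independent achievement motivation",
]

# Every unordered pair of distinct scales yields one intercorrelation.
pairs = list(combinations(scales, 2))
n_pairs = len(pairs)

# The same count from the closed form n(n - 1) / 2.
n_pairs_formula = comb(len(scales), 2)

print(n_pairs, n_pairs_formula)
```

Both counts agree with the 120 intercorrelations reported, of which 14 reached + or -.50.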

While the test-retest reliability of the EQ-C scales 
has not yet been measured, previous test-retest reli- 
ability of the EQ (Rothe, 1950) at r= .69 to .84 
has been satisfactory for its use as an appraisal aid. 
The EQ-C and earlier forms have sacrificed the split- 
half reliability that might have been obtained with 
fewer and longer scales of 40 items or more. Such 
long scales are not feasible in a short industrial ap- 
praisal test wherein the aim is to get as many clues 
about individual characteristics as possible in a 
short time. These clues can then be probed in an
intensive follow-up interview.


Statistical Procedures 


Statistical procedures² were similar to the original
study (Meyer & Pressel, 1954) in that mean scale
scores for different hierarchy groups, educational
groups, and age groups were computed. Again, a
single classification F test analysis of variance for
hierarchy alone was used to test the statistical sig-
nificance of the observed mean differences on hier-
archy for each test scale (Table D).¹ The three
scales lacking homogeneity of variance were trans-
formed by the arc-sine transformation (Snedecor,
1946) for the analysis of variance (Table D).¹ For
two scales, where the data were too skewed to meet
assumptions of normality, the data were ranked be-
fore computing H (Table D).¹
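
As a rough illustration of the single classification F test and the arc-sine transformation named above, the following Python sketch computes both from first principles. The toy scores, the function names, the use of radians, and the conversion of a scale score to a proportion by dividing by the number of scale items are all assumptions made for illustration; none of them is taken from the study or from Table D.

```python
from math import asin, sqrt


def one_way_f(groups):
    """Return (F, df_between, df_within) for a single classification ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared mean deviation.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: squared deviations from each group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def arcsine_transform(score, n_items):
    """Arc-sine (angular) transform of a score taken as a proportion of n_items."""
    return asin(sqrt(score / n_items))


# Hypothetical scale scores for three hierarchy groups.
toy_groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_value, df_between, df_within = one_way_f(toy_groups)

# Hypothetical score of 5 on a 10-item scale.
angle = arcsine_transform(5, 10)

print(f_value, df_between, df_within, angle)
```

The F ratio would then be referred to an F table with the two degrees of freedom shown; that lookup is omitted here.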

Because of the irregular N in the hierarchy groups,
a different approach was used to test for the inde-
pendence of hierarchy trends from age and educa-
tion trends than the double classification F test used
previously (Meyer & Pressel, 1954). A simple count
was made of the number of changes, in the predomi-
nant direction of change, of the mean EQ-C scale
score for each step up the hierarchy (variable group)
within each separate educational or age (nonvari-
able) group (Table A).³ The probability of this


² The authors are indebted to Jere P. Wilson for
carrying out the statistical procedures and contribut-
ing to their formulation.

³ A detailed report of the statistical procedures of
this method, along with detailed findings on the in-
dependence of scale trends in hierarchy, education
and age, as found in Table A, has been deposited
with the American Documentation Institute. Order
Document No. 5883 from ADI Auxiliary Publica- 







Table 1 


EQ-C Test Scale Means and Sigmas According to Hierarchy Level 


Hierarchy Level 





EQ-C Scale 


III IV
N=144 N=253


M SD 








Independent Achievement***
Detail***
Social Dominance***
Judgment and Decision**
Recognition Anxiety***
Drive Irritability**
Adjustment Psychological*
Social Extroversion**
Objectivity 5.7


1.9 . ‘ : 

1.6 30 Ly 5.8 
1.9 ‘ ; 5.6 
1.6 . ‘ 6.3 
1.3 : . 4.3 
1.6 w i2 4.5 
13 f ; 3.3 
22 ‘ : : 
1.9 : . 5.9 




*, **, and *** indicate, respectively, significance at the .05, .01, and .001 levels for a single classification F test.


number of changes occurring by chance was com-
puted.³ While this test of independence is weaker
than the double classification F test, relative to 
proper consideration of the size of the N and the 
variance for each mean score used, it is a strong 
test of trend in that the count of changes in the 
predominant direction is a direct measure of con- 
sistency of the trend in that direction. 
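
One plausible reading of this count, offered only as a sketch, treats each step up the hierarchy as equally likely to raise or lower the mean by chance, so that the number of steps in the predominant direction follows a binomial distribution. The authors' actual computation is described in the deposited document and may well differ; the means, group labels, and function names below are hypothetical.

```python
from math import comb


def count_changes(means):
    """Count upward and downward steps between successive means."""
    ups = sum(1 for a, b in zip(means, means[1:]) if b > a)
    downs = sum(1 for a, b in zip(means, means[1:]) if b < a)
    return ups, downs


def tail_probability(k, n, p=0.5):
    """Chance of at least k of n steps in one stated direction (binomial)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical scale means for six hierarchy levels (Level VI up to Level I),
# computed separately within three education groups.
means_by_education_group = [
    [4.0, 4.5, 5.0, 5.2, 5.9, 6.1],
    [4.2, 4.1, 4.8, 5.0, 5.5, 5.7],
    [3.9, 4.4, 4.3, 5.1, 5.6, 6.0],
]

ups = downs = 0
for means in means_by_education_group:
    u, d = count_changes(means)
    ups += u
    downs += d

n_steps = ups + downs
predominant = max(ups, downs)
p_chance = tail_probability(predominant, n_steps)

print(ups, downs, p_chance)
```

Because the predominant direction is chosen after inspecting the data, a two-sided version (doubling the tail) would be the more cautious choice under the same assumptions.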


Results 
Hierarchy Trends 


The major hierarchy trend results are shown 
in Table 1 and, for all scales, in Table E.¹
In the present study, three of the four scales 
that had observable trends and significant Fs 
on hierarchy in the EQ-B (Meyer & Pressel, 
1954), also have them in the same trend di- 
rection in the EQ-C. They were social domi- 
nance, detail, and adjustment psychological. 
Emotionality in the EQ-C did not achieve a 
significant F at the .05 level although it had 
a small trend. Similarly, objectivity and so- 
cial extroversion did not have observable hier- 
archy trends in either the EQ-B or EQ-C. 
Social extroversion had F test significance in 
the EQ-C, but not in the EQ-B. Drive irrita- 
bility did have an observable trend and sig- 
nificant F in the EQ-C, but drive did not in 
the EQ-B. 


tions Project, Photoduplication Service, Library of 
Congress, Washington 25, D. C., remitting $1.25 for
35 mm. microfilm or 6 X 8 in. photocopies. Make
checks payable to Chief, Photoduplication Service.


Of the seven original scales of the EQ-B,
detail (downward) and social dominance (up-
ward) had the second and third most marked⁴
scale trends with hierarchy (Table F)¹ of all
the EQ-C scales, and were two of four hier-
archy trend scales in the EQ-C with F test
significance at the .001 level. Ascending so- 
cial dominance and descending detail scores 
seem consistent findings in the ascending 
management hierarchy for repeated studies. 

Of the new scales of the EQ-C, independent 
achievement was not only the most marked 
in terms of magnitude of hierarchy trend (up- 
ward) of all the EQ-C scales, but it was also 
significant at the .001 level and there were no 
reversals of the trend at any of the six hier- 
archy levels. Recognition anxiety had the 
fifth most marked hierarchy trend (down- 
ward) and was significant at the .001 level. 
Judgment and decision (upward) and drive 
irritability (downward) were the fourth and 
sixth most marked trend scales in the EQ-C 
and both were significant at the .01 level. 

Of the remaining new scales, social consid- 
eration (upward), adjustment somatic (down- 
ward), psychopathic tendencies (downward), 
and personal achievement motivation (down- 


⁴ Referring to the magnitude of the difference be-
tween the average scale scores at the hierarchy levels
constituting the bottom and the top of the trend,
covering several hierarchy levels.







Table 2 


EQ-C Test Scale Means and Sigmas According to Education Level 


Education Level 





Graduate 
N=600 


2 Years 
High 
School 
N=29 


4 Years 
High 
School 

N= 246 


Grade 
School 
N=34 


2 Years 
College 
N=117 





SD 
1.9 
1.6 
1.9 

Judgment and Decision 1.4 

i2 
1.8 
1.4 
1.9 
64 2.4 


EQ-C Scale 


Independent Achievement 
Detail 
Social Dominance 


Recognition Anxiety 
Drive Irritability 
Adjustment Psychological 
2.2 
2.3 


Social Extroversion 


Objectivity 6.2 


ward) all had slight hierarchy trends and 
were significant at the .05 level. 

Of 16 EQ-C scales, only four have no 
observable trends with hierarchy. They are 
objectivity, social extroversion, drive persist- 
ence, and compensatory achievement motiva- 
tion. Only four scales did not achieve F test 
significance of their hierarchy means at the 
.05 level or better. They were objectivity,
emotionality, drive persistence, and compen- 
satory achievement. 

The greatest difference in hierarchy trend 
findings between the original study (Meyer & 
Pressel, 1954) and the present one is the 
marked failure of the top hierarchy group to 


M SD 


M SD 


M SD M SD 





5.4 
6.1 
5.0 
5.0 
5.2 
4.7 
4.3 
6.5 
5.6 


2.0 
1.2 
1.9 
1.6 
1.5 
1.7 
1.4 
1.9 
y 


6.0 
5.9 
6.2 
5.6 
4.6 
4.5 
3.6 
7.3 
5.8 


1.6 
1.4 
1.7 
1.8 
1.5 
1.6 
1.6 
2.0 
2.1 


5.7 

5.8 

5.6 

6.1 

4.4 

4.5 

3.3 

78 2. 
5.8 2.2 


2.4 
1.6 
1.7 
1.8 
1.9 
1.6 
1.9 
2.4 
2.0 


5.4 


3.9 
4.3 
2.9 
7.9 
6.0 
follow the otherwise observable hierarchy
trends (Table 1) (Table E).¹ In all the
12 hierarchy trend scales, except independent 
achievement motivation, the Hierarchy Level 
I average constitutes a reversal of the hier- 
archy trend established in the previous levels. 
In the original study, there was no such re- 
versal for Level I (Meyer & Pressel, 1954). 


Education and Age Trends 


These trends are shown in Tables 2 and 3 
and Tables G and H.¹ In the current study,
education trends in the EQ-C scales were 
much more marked than in the previous study 


Table 3 


EQ-C Test Scale Means According to Age Level 


Years 
20-30 
N=126 


EQ-C Scale 


Independent Achievement 
Detail 

Social Dominance 
Judgment and Decision 
Recognition Anxiety 
Drive Irritability 
Adjustment Psychological 
Social Extroversion 
Objectivity 


Years 
30-40 
N=318 


M M 


6.2 
5.3 


5.9 
5.7 







(Meyer & Pressel, 1954), possibly because 
the education categories were extended to in- 
clude graduate training at the top and two 
years of high school and grade school at the 
bottom. Furthermore, all 16 EQ-C scales had 
observable trends with education. The mag- 
nitude of trend differences was greatest for 
education and least for age (Table F).¹ Av-
erage magnitude of all observed scale trends⁴
in absolute mean score values was 1.42 for 
education; .87 for hierarchy; and .54 for age. 
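
The magnitude of a trend, as defined in the footnote, is the difference between the mean scale scores at the two ends of the trend. A minimal Python sketch of that definition and of the averaging across scales follows; every number in it is hypothetical and the function name is illustrative, since the scale-by-scale values are in the deposited Table F rather than in the text.

```python
def trend_magnitude(means, bottom, top):
    """Absolute difference between the means at the two ends of a trend."""
    return abs(means[top] - means[bottom])


# Hypothetical mean scores for one scale, ordered Level VI up to Level I.
hypothetical_means = [4.8, 5.2, 5.6, 5.9, 6.3, 6.0]

# The upward trend runs from Level VI (index 0) to Level II (index 4);
# the Level I mean (index 5) reverses it and lies outside the trend.
magnitude = trend_magnitude(hypothetical_means, bottom=0, top=4)

# Averaging hypothetical magnitudes over several scales for one variable.
hypothetical_magnitudes = [1.5, 0.9, 0.6]
average_magnitude = sum(hypothetical_magnitudes) / len(hypothetical_magnitudes)

print(round(magnitude, 2), round(average_magnitude, 2))
```

Computed this way for each variable, the averages can be compared directly, as the text does for education, hierarchy, and age.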

With the exception of detail and those 
scales that had no trend for age, the direc- 
tions of observed scale trends with education 
were opposite of those for age, whereas all 
observed hierarchy trends were in the same 
direction as those for education (Table F).¹
While no single classification F test of sig- 
nificance of mean EQ-C scale score differ- 
ences was carried out for education and age 
categories because of budget limitations, it 
might well be predicted from the magnitude 
and consistency of the trends that many of 
the scales having trends with education and 
few of the scales having trends with age 
would also have F test significance. It should 
be kept in mind that in each of these single 
variable classifications of hierarchy, education 
and age, as shown in Tables 1, 2, and 3, only 
the variable described is controlled. Actu- 
ally, in this management sample, all three 
variables are interrelated as shown in Table 4. 


Independence of Trends 


Table A³ shows the trend results for hier-
archy, age, and education when two of the
variables are controlled. In the execution of 
the previously described procedures * for com- 


Table 4 
Age and Education Means According to 
Hierarchy Level 
Hierarchy Level 


II III IV


Age* A. 43.0 40.0 36.9
Education** 4. 5.0 4.6 3.9


*In years. 

** 6 = Graduate Level Work; 5 = 4 Years College; 4 = 2 
Years College; 3 = 4 Years High School; 2 = 2 Years High 
School; 1 = Grade School. 




puting independence of trends in a double 
classification system without using analysis of 
variance, none of the hierarchy trends of the 
EQ-C scales was found to exist independently 
of education at the .05 level of significance 
when education was controlled. When age 
was controlled, hierarchy trends persisted at 
the .05 level of significance for social domi- 
nance, judgment and decision, detail, adjust- 
ment psychological, and psychopathic tend- 
encies. 

On the other hand, when hierarchy is held 
constant, education trends at the .05 level 
of confidence persist in the EQ-C scales for 
judgment and decision, detail, psychopathic 
tendencies, recognition anxiety, compensatory 
achievement, and independent achievement. 
With hierarchy controlled, significant age
trends persist for objectivity, social consid-
eration, judgment and decision, and drive per-
sistence (Table A).³ All of these significant
trends with two controlled variables are in 
the same direction as the trends for educa- 
tion, hierarchy, and age when only one vari- 
able is controlled. 


Discussion 


Several points need the clarification of dis- 
cussion to place this study in clear perspec- 
tive. These points are: first, the observed, 
rather than statistically proven, existence of 
the EQ-C scale trends; second, the possibility 
of interpreting the observed trends as due to 
differences in objectivity or social desirability 
bias; third, the possibility of interpreting the 
observed hierarchy trends and reversal of 
trends at the top level as a function of un- 
derlying education and age trends in both 
personality scale scores and hierarchy com- 
position; fourth, the possibility that inde- 
pendent achievement motivation is the only 
independent personality scale predictor for 
hierarchy, the other personality scale trends 
being by-products of selection by education 
and advancement by age (experience). 


Validity of Observed Trends 

It should be kept in mind that neither the 
single classification F test nor the number 
of changes in the predominant direction of 
change tests are final statistical proof of the 







validity of trends. Trends reported in this 
study are observed upward or downward pro- 
gressions of scale score means in successive 
categories of hierarchy, age, or education. 
The magnitude of the changes and the num- 
ber of reversals to be allowed is a matter of 
judgment. In this study, the F test analysis 
of variance helps the certainty of the trend 
observation in that it implies that the ob- 
served differences between the means are not 
due to chance and are likely to be repeatable. 
It in no way certifies that there is a linear 
progression. In general practice, however, 
there must be a certain magnitude of mean 
score differences and consistency of progres- 
sion, or the scale will not come out with F 
test significance. However, curvilinear rela- 
tionships, as well as linear trends, will also 
give F test significances, as occurred with so- 
cial extroversion. Therefore, the F test of 
the single classification analysis of variance 
does not prove a trend as we have used the 
term in this study. The standard for trend 
is observation of progression, supported by 
an F test for verification that the observation 
of mean differences is not due to chance. 
The inadequacies of the number of changes 
test of trend have been pointed out. It as- 
sures the improbability of such a consistent 
progression occurring by chance, but it does 
not insure that the means on which the pro- 
gression is based are reliable. This test was 
only used for the testing of independence of 
trends from one of the other variables be- 
cause of the unsuitability of the data to the 
double classification analysis of variance. 
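
The point that a significant F does not certify an ordered progression can be illustrated with a short Python sketch: the F ratio depends only on how far the group means lie from the grand mean, not on the order in which the groups are arranged, so a steadily rising set of means and a scrambled set give the same F. The scores and the function name are hypothetical.

```python
from itertools import permutations


def f_ratio(groups):
    """Single classification F ratio computed from raw scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical scores whose group means rise steadily: 5, 7, 9.
rising = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

# The same groups in every possible order, including non-monotonic ones
# such as means of 7, 9, 5.
f_values = {round(f_ratio(list(order)), 9) for order in permutations(rising)}

print(f_values)
```

All six orderings yield a single F value, which is why the observation of progression has to be made separately from the F test.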


Objectivity and Social Desirability Bias 


In the original study, objectivity seemed 
much more of a general bias indicator than in 
the present study. It is to be remembered 
that in the revision of the EQ-B to form the 
EQ-C, special care was taken to eliminate 
items from other scales that also discrimi-
nated between high and low objectivity
groups. In doing so, the objectivity scale
seems to have been changed so that it no
longer is as independent of hierarchy, age,
and education levels. High objectivity, de-
fined as willingness to admit to some minor
undesirable behavior characteristics, has be-
come a favored trait, one which declines with
age and increases with education (Tables 2 
and 3) (Table A).³ However, there is no
evidence that the trends in hierarchy are a 
function of objectivity bias (Table 1). While 
the hierarchy trends are generally in the di-
rection of a more socially desirable profile of
scale scores, there is no evidence that objec-
tivity scores parallel and make possible this
progression as the hierarchy ascends.

Inasmuch as social dominance and social
extroversion appear to be relatively unrelated
to objectivity bias in the EQ-C, as shown by
intercorrelations (Table B),¹ an extensive so-
cial desirability bias study along the lines
laid down by Edwards (1957a) was under-
taken and is now well along. It uses the data
of the present study and there is enough
analysis completed to indicate that degree of
social desirability bias is certainly a strong
determinant of magnitude of scores on the
more obviously socially desirable or undesir-
able scales such as social extroversion and ad-
justment somatic, but not on almost neutral
scales such as detail. This social desirability
study also indicates that most hierarchy and
education trends of the present study are in
a socially approved direction, and age trends,
such as they are, generally are not. How-
ever, there are some notable exceptions to this
generalization. Detail has declining trends
with all three variables and is socially neutral.
Objectivity is socially undesirable and social
extroversion is socially desirable, but neither
has trends with hierarchy and both have up-
ward trends with education.

As Edwards (1957a) has pointed out, peo- 
ple are inclined to answer personality test 
questions in a socially desirable way, to a de- 
gree depending on the social desirability of 
the question. They still differ rather widely 
in their willingness to attribute socially un- 
desirable behavior to themselves, depending 
possibly on the strength of their personal 
attitudes of social inferiority or social su- 
periority. The question is whether individual 
differences in this area can account for the 
major part of these trends of personality test 
scores in management hierarchy or, for that 
matter, in education and age. The answer to 
this question must await an intensive analy- 
sis of the data of the social desirability bias 







study. While the F tests of hierarchy, edu- 
cation, and age means of the new 40-item 
social desirability bias scale of the EQ-C are 
not complete, the means themselves have no 
observable trend with hierarchy or age. How- 
ever, there is a distinctly observable upward 
trend of this social bias scale with education. 
Studies are under way of the relation of so- 
cial desirability bias scores on the EQ-C to 
scale scores on the Edwards Personal Pref- 
erence Test (Edwards, 1957b), which also 
will help to obtain a better answer to this 
question. 


Hierarchy Level I Trend Reversals 


Another finding which requires further dis-
cussion is the marked failure of the top hier-
archy group to follow the otherwise observ-
able hierarchy trends in EQ-C scale average
scores. While it is admittedly possible that
the reversals of observed trends in the top
hierarchy group were due to an inadequate
sample (30 cases in Level I), another explana-
tion seems more plausible. The suggested hy-
pothesis is that the trend reversals in Hier-
archy Level I are a function of the composi-
tion of the sample with regard to actual age
and education. While there is an actual age
and education trend with hierarchy that is
quite marked (Table 4), in Hierarchy Level
I, the actual education trend is reversed.
That is, the people in Level I have some-
what less education on the average than those
in Level II, but have a higher age. Since the
hierarchy trends in the EQ-C scales tend to
be in the same direction as the education
trends in the EQ-C scales, a reversal of EQ-C
scale trends with a reversal of actual educa-
tion trends would be expected. Also, since
the EQ-C age trends tend to be the opposite
of EQ-C hierarchy trends, and there is an
actual age trend in the hierarchy, it would be
expected that the reversal for this maximum
age group of Level I would be greater than
that accountable for by the education re-
versal alone. Hence, more specifically, the
reversal of the actual education trend and the
continuation of the actual age trend could
account for a good part of the marked trend
reversals observed in EQ-C scales for Hier-
archy Level I.




The continuation of the hierarchy trend for 
the independent achievement motivation scale 
is the main exception to this hypothesis, and 
suggests that being in Level I rather than in 
Level II in the management hierarchy is more 
a function of greater achievement motivation 
or independent resourcefulness than of better 
personality in other respects, as defined by 
society. It should be kept in mind, of course, 
that Hierarchy Level I and II samples are 
relatively small and that they are largely
drawn from medium-sized, rather than large, 
companies. What determines becoming an 
officer or general manager in small and me- 
dium-sized businesses may not be the deter- 
minant for very large businesses. 


Personality Versus Experience and Education 

Two conflicting general hypotheses regard-
ing personality test scores in the manage-
ment hierarchy are suggested. The first is
that there are no strong trends except for in- 
dependent achievement. And, except for that 
one scale, the trends observed are a function 
of age and education influences on scale scores 
and the actual age and education composition 
of the management hierarchy. According to 
this hypothesis, management selection and 
advancement is by education and experience 
(age), supported by resourcefulness and in- 
dependence in achieving job results. The ob- 
served hierarchy trends in personality test 
scores, except for independent achievement, 
are accidents of educational and age influ- 
ences on personality. 

The second general hypothesis is that po-
sition in the management hierarchy is the re-
sult of a selective process whereby more in-
telligent⁵ people with better personalities, as
defined by society, and stronger independent 
achievement motivation generally tend to rise 
higher in the hierarchy with age and experi- 
ence than their colleagues less talented in 
these respects. In this process, they are con- 
stantly fighting a rear-guard action against 
the retrograde effects of age in all these abil- 
ity variables. According to this hypothesis, 


⁵ A hierarchy study of intelligence test scores has
been completed by E. L. Kendall of the S. J. & H. 
permanent staff, using the same sample as the pres- 
ent study. It is being prepared for publication at 
this time. 







people with superior ability in intelligence, 
personality, and achievement motivation also 
tend to achieve a higher education level. At 
the time of testing, education is completed 
while hierarchy ascension is not. Therefore, 
the relationship of these measures to educa- 
tion is greater than to hierarchy. Unfortu- 
nately, higher education also makes for more 
social sophistication and inclination to social 
desirability bias, as contrasted with extremely 
poor education. However, in the range of 
the management hierarchy education averages 
(four years high school to four years col- 
lege), the difference in social desirability bias 
is not marked enough to interfere greatly with 
personality test score results. 

This broad ability selection hypothesis is 
weakest when applied to small businesses, 
where the owner starts at the top through 
unusual initiative, and gradually builds a 
hierarchy under him. It fits best the large, 
established business where promotion is from 
within the firm, from the bottom to the top 
and the hierarchy is well structured. 


Summary 


The present study achieved one of its major
purposes in showing that a 678-case, 1955-1957
sample of management appraisal candidates
had the same observed trends in personality
test scores with hierarchy as did a
459-case, 1949-1952 sample. The 1955-1957 
study utilized data from the EQ-C, a revision 
of the EQ-B utilized for the 1949-1952 study. 
The observed trends for personality test scores 
in hierarchy were the same for both studies 
in respect to the closely correlated scales of 
social dominance, a marked upward trend; 
detail, a marked downward trend; and ad- 
justment psychological, a mild downward 
trend. Also, F tests on hierarchy were significant
for these scales. In both studies,
objectivity and social extroversion showed no
hierarchy trends and had no F-test signifi- 
cance. Only drive and emotionality were 
different. Drive irritability was negative in 
trend and had no F-test significance on the 
EQ-B, and was negative in trend and had F- 
test significance on the EQ-C. Emotionality 
had F-test significance and was negative in 
trend on the EQ-B, and did not have significance
and was negative in trend on the EQ-C.
It was emphasized in the discussion that an
analysis of variance F test is not absolute 
proof of trend and that the trends presented 
were merely observed. The F test supports 
the probability of the observed mean differ- 
ences not being due to chance. 
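
The distinction between F-test significance and trend can be illustrated with a minimal sketch. The scores below are hypothetical and are not data from the study; the function name `one_way_f` is likewise only illustrative. The sketch computes a one-way analysis of variance F ratio and shows that the ratio is unchanged when the order of the groups is rearranged, which is why a significant F supports the presence of mean differences but says nothing about their direction across hierarchy levels.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical scale scores for three hierarchy levels (assumed values).
level_1 = [14, 15, 16]
level_2 = [11, 12, 13]
level_3 = [8, 9, 10]

# Groups in hierarchy order: the means fall steadily from level to level.
f_ordered, df_b, df_w = one_way_f([level_1, level_2, level_3])

# The same groups in a different order: no steady trend, yet the same F.
f_shuffled, _, _ = one_way_f([level_2, level_3, level_1])

print(f_ordered, df_b, df_w)
print(f_shuffled)
```

Because the F ratio depends only on the spread of the group means and not on their order, a trend must still be read from the means themselves.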

In addition, nine new scales, developed for 
the EQ-C after the EQ-B study, were ob- 
served for hierarchy trend. Of these, inde- 
pendent achievement and judgment and deci- 
sion showed the most marked, positive trends 
and had the greatest F test significance; and 
recognition anxiety had the most marked, 
negative trend and F test significance. Of 
the other five new scales, social consideration, 
adjustment somatic, psychopathic tendencies, 
and personal achievement showed mild trends 
with hierarchy, all negative except for social 
consideration, and all had F-test significance. 
Compensatory achievement and drive persist- 
ence showed no trends with hierarchy sup- 
ported by F-test significance. Of all the scale 
trends on hierarchy of the EQ-C, only inde- 
pendent achievement motivation was not re- 
versed at Hierarchy Level I. It was the only 
perfect trend and the only trend to be greater 
in magnitude with hierarchy than with edu- 
cation. 

EQ-C scale trends of an even more marked 
nature, but in the same direction as for hier- 
archy, were noted for education. Opposite, 
but lesser trends were observed for age, ex- 
cept for detail where the trend direction was 
the same. Neither education nor age trends 
in EQ-C scales were tested for support by 
analysis of variance. Education had observ- 
able trends in all 16 scales, but hierarchy and 
age did not. Also, there was a marked trend 
for actual age and education to increase with 
hierarchy. Again, these trend observations 
were not supported by analysis of variance. 

Part of the trend reversal for Hierarchy Level I
was hypothesized to be accounted for by the
departure of the Hierarchy Level I group from
the actual education trend in the hierarchy.
This allowed the negative influence of
the age trend to continue and the positive 
education trend to be reversed for Hierarchy 
Level I, EQ-C scale averages. 

A double classification control of a new
type, for independence of paired variables,
carried out by counting changes in the trend
direction and computing the probability of
these occurring by chance, was conducted.
This indicated that none of the hierarchy 
trends was demonstrably independent of edu- 
cation when education was controlled. Five 
scales showed hierarchy trends when age was 
controlled. Six scales showed education trends 
when hierarchy was controlled, and four scales 
showed age trends when hierarchy was con- 
trolled. These double classification studies 
were held to be only rough indicators of in- 
dependence of trends because the counting 
method of measuring trend did not give any 
weight to the N in each category, except to 
arbitrarily specify that N must be a certain 
magnitude to be counted. 
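
A counting method of this general kind can be sketched as follows. The exact computation used in the study is not given in this summary, so the sketch rests on stated assumptions: each change between adjacent category means is treated as an independent event that is equally likely to go up or down by chance, and the probability is taken from the binomial distribution. The scale means, the stratum labels, and the function names are hypothetical. As in the study's own caution, the N in each category receives no weight here.

```python
from math import comb

def step_directions(means):
    """Sign (+1, -1, or 0) of each change between adjacent category means."""
    return [(b > a) - (b < a) for a, b in zip(means, means[1:])]

def prob_at_least(k, n, p=0.5):
    """Binomial probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical mean scale scores across four hierarchy levels,
# tabulated separately within each education stratum (assumed values).
strata = {
    "high school": [10.0, 11.0, 12.5, 13.0],
    "some college": [11.0, 12.0, 11.5, 13.5],
    "college": [12.0, 13.0, 14.0, 14.5],
}

# Pool the step directions from every stratum.
steps = [d for means in strata.values() for d in step_directions(means)]
ups = sum(1 for d in steps if d > 0)
n_steps = sum(1 for d in steps if d != 0)  # ties are dropped from the count

# Chance probability of seeing at least this many upward steps.
p_chance = prob_at_least(ups, n_steps)
print(ups, n_steps, p_chance)
```

With these assumed values, 8 of the 9 steps are upward, and the chance probability of 8 or more upward steps in 9 is 10/512, or about .02.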

Brief reference was made to additional S. J. 
& H. studies, now in process, of the influence 
of social desirability bias on the EQ-C and 
the Edwards Personal Preference Schedule, 
and of intelligence test score trends in the 
management hierarchy. These were men- 
tioned to further support the hypothesis that 
trend results in hierarchy are not primarily 
a function of social desirability bias. 

Two alternative and opposed general hypotheses
were offered as possible explanations
of the general findings. One suggested hypothesis
was that personality scale trends in
management hierarchy are a function of so- 
cially positive and marked educational trends, 
combined with small and socially negative 
age trends for the personality scales. According
to this first hypothesis, hierarchy selection
is by education. Management hierarchy
trends in personality scale scores are
merely an accident of the education and age 
composition of the hierarchy and only inde- 
pendent achievement is a truly independent 
personality predictor of hierarchy level. 

A second general hypothesis was suggested 
contrary to the first: that personality scale 
trend results in hierarchy are a function of 
actual hierarchy composition in which more 
able people in intelligence, personality, and 
independent achievement motivation win out 
with age and experience in the competition to 
ascend the management hierarchy and, in 
earlier years, to achieve higher education. 


Received December 31, 1958. 
Early Publication. 


References 


Adler, A. The practice and theory of individual psy- 
chology. New York: Harcourt, Brace, 1929. 

Edwards, A. L. The social desirability variable in 
personality assessment and research. New York: 
The Dryden Press, 1957. (a) 

Edwards, A. L. Edwards Personal Preference Sched- 
ule, Manual. New York: Psychological Corpora- 
tion, 1957. (b) 

McClelland, D. C., Atkinson, J. W., Clark, R. A., & 
Lowell, E. L. The achievement motive. New 
York: Appleton-Century-Crofts, 1953. 

Meyer, H. D., & Pressel, G. L. Personality test 
scores in the management hierarchy. J. appl. 
Psychol., 1954, 38, 73-80. 

Rothe, H. F. Use of an objectivity key on a short
industrial personality questionnaire. J. appl. Psychol.,
1950, 34, 98-101.

Snedecor, G. W. Statistical methods. (4th ed.)
Ames, Iowa: Iowa State Coll. Press, 1946.









































































